-1) Currently we call is_prepared() to check whether or not to call spmt_parent_enter_method() in an invoke. There is a race condition here:

      T1: is_prepared() --> false
      T2: PREPARE_METHOD
      T1: never executes PREPARE_METHOD

   It is unlikely, but it is a race condition nonetheless. The way to fix it is to have a new instruction at the beginning of each method. For LINK_NATIVE_METHOD, just put the call to spmt_parent_enter_method() in NATIVE_(NON)STATIC_METHOD... by that time, there won't be any need for the is_prepared() check + enter() in INVOKE anyway.

0) A SableVM magic word in object headers is used to detect references. This is unsafe because you might end up with a stale SableVM magic word in something that's not an object, and consequently do something bad like dereference a pointer in something that's not a vtable...

1) Summary: Next context prediction is made in SPMT_JOIN.

   Description: The context prediction for a callsite is made when the parent returns to that callsite, i.e. it is an advance prediction.

   Solution: The context prediction should be made when the child actually starts up. This moves some of the cost of predictor updates out of the parent and into the child.

      spmt.prediction.defer.to.child

   In fact, we'll always do the context prediction in the child, so there is no need for this property. There is always less overhead on the parent this way.

2) Summary: Copy global context to child at fork time.

   Description: We might want to keep a copy of the context at fork time in the child. This might help guard against problems with asynchronous updates to the global context between fork and prediction time.

   Solution: This could be expensive in terms of memory, and we might not see any benefit (it shifts the async update problem around but doesn't eliminate it), so we'll do it via a new property:

      spmt.prediction.copy.fork.context

   Notes: Things get more complicated when considering children of children, a.k.a. nested speculation. We'll just do the simplest thing: if children are receiving copies of a forking context, always copy the global one, and upon returning to a speculative fork point, always update the global context.

3) Summary: Predictors are updated in SPMT_JOIN even without prediction.

   Description: The only guard in SPMT_JOIN at the moment is not to update predictors if the current virtual is speculative. In fact, there may be more fine-grained requirements with respect to predictor updates.

   Solution: Children will never fork children and then join them, at least in the current design. However, we might still allow for child updates of history and predictors at speculative fork points. In general, should we enable history and predictor updates upon returning to fork points?

      speculative  fork point  |  history update  predictor update
      -----------  ----------  |  --------------  ----------------
      no           no          |  maybe (N)       maybe (N)
      no           yes         |  yes             yes
      yes          no          |  maybe (N)       maybe (N)
      yes          yes         |  maybe (N)       maybe (N)

   The new defaults are shown in brackets. IIRC, some people have proposed an update queue to deal with the problem of speculative updates -- might not be such a bad idea.

   So:

      *.update.predictor --> some previous prediction
      *.no.fork.update.predictor --> !spmt.prediction.defer.to.child
      *.update.predictor --> *.update.history  (maybe, maybe not -- consider memoization)

      spmt.prediction.non.spec.no.fork.update.history
      spmt.prediction.non.spec.no.fork.update.predictor
      spmt.prediction.spec.no.fork.update.history
      spmt.prediction.spec.no.fork.update.predictor
      spmt.prediction.spec.fork.update.history
      spmt.prediction.spec.fork.update.predictor

   Notes: In some cases, updating predictors may yield some extra accuracy. However, the extra overhead might well outweigh the benefits. We can't really justify updating the predictor if we didn't fork a child, unless a dead prediction was made by the non-forking parent. This is probably more true for the table-based predictors (CON and MEM) than the LST, STR, 2DS, and PAR predictors.
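
   As a rough illustration of how the table and properties in item 3 could be wired together, here is a minimal C sketch of a guard deciding whether to update a predictor. The spmt_env struct and its fields are hypothetical stand-ins that simply mirror the properties listed above; nothing here is existing SableVM code.

   /* Hypothetical guard encoding the decision table from item 3; each
      "maybe (N)" row maps to one of the spmt.prediction.*.update.predictor
      properties, defaulting to false. */

   #include <jni.h>

   typedef struct spmt_env
   {
     jboolean non_spec_no_fork_update_predictor;  /* maybe (N) */
     jboolean spec_no_fork_update_predictor;      /* maybe (N) */
     jboolean spec_fork_update_predictor;         /* maybe (N) */
   } spmt_env;

   static jboolean
   spmt_should_update_predictor (const spmt_env *env,
                                 jboolean speculative, jboolean fork_point)
   {
     if (!speculative && fork_point)
       return JNI_TRUE;                               /* always update */
     if (!speculative && !fork_point)
       return env->non_spec_no_fork_update_predictor;
     if (speculative && !fork_point)
       return env->spec_no_fork_update_predictor;
     return env->spec_fork_update_predictor;          /* spec + fork */
   }

   The history-update column would be handled identically via the corresponding *.update.history properties.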
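
   And a minimal sketch of the update-queue idea mentioned at the end of item 3, assuming hypothetical types throughout (none of these are existing SableVM structures): a speculative child records intended predictor updates locally, and the non-speculative parent either commits them at a successful join or throws them away on failure.

   #include <stddef.h>

   #define SPMT_UPDATE_QUEUE_SIZE 64

   typedef struct spmt_pred_update
   {
     void *predictor;          /* predictor instance to update */
     size_t entry_index;       /* table slot the prediction came from */
     long observed_value;      /* placeholder for whatever the predictor learns */
   } spmt_pred_update;

   typedef struct spmt_update_queue
   {
     spmt_pred_update entries[SPMT_UPDATE_QUEUE_SIZE];
     size_t count;
   } spmt_update_queue;

   /* Called by the speculative child instead of writing the shared tables. */
   static void
   spmt_queue_update (spmt_update_queue *q, void *predictor,
                      size_t entry_index, long observed_value)
   {
     if (q->count < SPMT_UPDATE_QUEUE_SIZE)
       {
         q->entries[q->count].predictor = predictor;
         q->entries[q->count].entry_index = entry_index;
         q->entries[q->count].observed_value = observed_value;
         q->count++;
       }
     /* On overflow, updates are simply dropped; accuracy may suffer but
        safety does not. */
   }

   /* Called by the non-speculative parent at a successful join, inside
      whatever critical section already protects the predictor tables. */
   static void
   spmt_commit_updates (spmt_update_queue *q,
                        void (*apply) (void *predictor, size_t entry_index,
                                       long observed_value))
   {
     size_t i;

     for (i = 0; i < q->count; i++)
       apply (q->entries[i].predictor, q->entries[i].entry_index,
              q->entries[i].observed_value);
     q->count = 0;
   }

   /* Called instead of spmt_commit_updates when the child fails. */
   static void
   spmt_discard_updates (spmt_update_queue *q)
   {
     q->count = 0;
   }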

4) Summary: Last returned entry in table-based predictors might get updated before SPMT_JOIN.

   Description: Parent virtuals might not be updating the right slot in the table-based predictors. Although we can determine whether individual predictions were correct or not, the corresponding entry used in the tables is not kept.

   Solution: When making context and memoization predictions, keep pointers to the last table entries used in the spmt_child (i.e. _svmt_JNIEnv) struct. Update these entries on join/fail.

      spmt.prediction.remember.child.table.entries

5) Summary: Forced failures leave orphaned predictions.

   Description: For forced failures away from join points, i.e. parental exception handling failures and recursive failures in nested speculation, the table entries for predictions won't get updated, even though they've been created.

   Solution: Rehashing complicates things. If rehashing is taking place, these entries can't really be deleted, since finding a null entry will prevent further rehashing. Instead they could be marked as invalid so that they don't get propagated when the table expands:

      spmt.prediction.mark.invalid.table.entries

   If rehashing is not taking place, we can just delete the entries.

6) Summary: Predictor data is shared but neither volatile nor always written synchronously.

   Description: Predictors are modified asynchronously by both speculative and non-speculative virtuals.

   Solution: Consider moving predictor changes into critical sections, and declaring more things volatile. Requires a thorough audit of predictor code.

   Notes: Fixing this problem is expected to increase accuracy at the expense of speed. It doesn't affect safety.

8) Summary: GC invalidation of MEM tables ignores PD results.

   Description: There is an option to invalidate memoization tables that use a reference arg as input to the hash function. However, if the PD analysis for a callsite identifies all reference parameters as independent and none are actually being used, then there shouldn't be any invalidation of the memoization table after GC.

   Solution: Iterate through the PD parameters in the method signature and look for reference args. If there aren't any, the predictor->has_reference_arg field should not be set to JNI_TRUE. Do this only once at method/predictor preparation time.

   Notes: Currently we aren't invalidating tables at all.

10) Summary: Asynchronous recursive failure might cause nested children to buffer stack frames unsafely.

   Description: Nested speculation will see children attempting to buffer stack frames from their ancestors. However, if an asynchronous recursive failure is taking place, the child might still be running when actually it should be dead. Illegal buffering attempts could be made using either speculative or non-speculative ancestor stacks; the whole point of asynchronous recursive child failure is to prevent the parent from wasting too much time at a join point.

   Solution: It appears that the only real solution is to check on stack buffering that a route can be traced from the current child all the way to the non-speculative parent without encountering any dead children. An alternative would be to defer freeing ancestor stacks until backing out of the recursion, thereby ensuring there is always something for the child to copy.
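
   A minimal sketch of the mark-invalid idea from item 5, assuming an open-addressed table where probing stops at the first null slot. The spmt_table_entry struct and both helpers are hypothetical names, not existing SableVM code.

   #include <jni.h>

   typedef struct spmt_table_entry
   {
     jboolean occupied;   /* slot has ever been written */
     jboolean invalid;    /* orphaned by a forced failure; drop on expansion */
     long key;
     long value;
   } spmt_table_entry;

   /* Called on a forced failure (parental exception handling, recursive
      failure) instead of clearing the slot, so probing still walks past it. */
   static void
   spmt_invalidate_entry (spmt_table_entry *entry)
   {
     entry->invalid = JNI_TRUE;
   }

   /* Called while copying entries into a newly expanded table; invalid
      entries are skipped here, which is where they finally disappear. */
   static jboolean
   spmt_entry_should_propagate (const spmt_table_entry *entry)
   {
     return (entry->occupied && !entry->invalid) ? JNI_TRUE : JNI_FALSE;
   }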
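
   For item 8, a sketch of the preparation-time check, simplified to look only at the method descriptor; the real fix would also consult the PD results to see whether any reference parameter actually feeds the memoization hash. The helper name is hypothetical, but the descriptor grammar it walks is the standard JVM one.

   #include <jni.h>
   #include <string.h>

   /* Returns JNI_TRUE if the method descriptor (e.g. "(I[JLjava/lang/Object;)V")
      declares at least one reference parameter (object or array type). */
   static jboolean
   spmt_signature_has_reference_arg (const char *descriptor)
   {
     const char *p = strchr (descriptor, '(');

     if (p == NULL)
       return JNI_FALSE;

     p++;                        /* skip the opening '(' */
     while (*p != '\0' && *p != ')')
       {
         if (*p == '[' || *p == 'L')
           return JNI_TRUE;      /* array or object parameter */
         p++;                    /* primitive parameter: B C D F I J S Z */
       }

     return JNI_FALSE;
   }

   If this returns JNI_FALSE at preparation time, has_reference_arg can be left unset and the memoization table never needs to be invalidated after GC.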
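
   And for item 10, a sketch of the route check described in the solution, with a hypothetical spmt_child struct and parent/is_dead fields standing in for whatever the real child bookkeeping looks like.

   #include <jni.h>

   typedef struct spmt_child
   {
     struct spmt_child *parent;   /* NULL for the non-speculative thread */
     volatile jboolean is_dead;   /* set asynchronously on recursive failure */
   } spmt_child;

   /* Returns JNI_TRUE only if every link from child up to the non-speculative
      parent is still alive, i.e. it is safe to buffer ancestor stack frames. */
   static jboolean
   spmt_route_to_parent_is_alive (const spmt_child *child)
   {
     const spmt_child *c;

     for (c = child; c != NULL; c = c->parent)
       {
         if (c->is_dead)
           return JNI_FALSE;
       }

     return JNI_TRUE;
   }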

11) Summary: Predictor accuracy is updated outside of a critical section.

   Description: _svmf_predictor_update_accuracy() is called outside of a spinlock critical section. This is affecting accuracy, although the extent is not known. The assertion

      assert (accuracy_count >= 0 && accuracy_count <= 32);

   is failing for mtrt.

   Solution: Experiment with moving accuracy updates into a critical section and see if it makes any difference to performance.

12) Summary: Predictor critical sections may be too big.

   Description: Currently predictors are synchronized using rather broad critical sections in spmt_instructions.m4.c. For safety, only the context and memoization predictor operations really need to be synchronized.

   Solution: Experiment with finer-grained RVP critical sections and see if they make any difference.

      spmt.prediction.fine.grained.spinlock

   Notes: Could also attempt various other things, like alternating access to memoization and context predictions, or not forking children at callsites if the predictor is already taken by another native (a helper, in this case) or even by the current helper but with a different child.
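
   For items 11 and 12, a rough sketch using a POSIX spinlock as a stand-in for whatever lock the SpMT code actually uses: each predictor gets its own lock (the fine-grained option from item 12) and the accuracy counter is only touched while holding it (item 11). The struct and helper are hypothetical; only the 0..32 bound comes from the assertion quoted above.

   #include <pthread.h>

   typedef struct spmt_predictor
   {
     pthread_spinlock_t lock;   /* one lock per predictor, not one global one */
     int accuracy_count;        /* saturating counter in [0, 32] */
   } spmt_predictor;

   static void
   spmt_predictor_init (spmt_predictor *predictor)
   {
     pthread_spin_init (&predictor->lock, PTHREAD_PROCESS_PRIVATE);
     predictor->accuracy_count = 0;
   }

   /* Accuracy bookkeeping done entirely inside the per-predictor critical
      section, so concurrent joins can no longer push the counter out of
      range. */
   static void
   spmt_predictor_update_accuracy (spmt_predictor *predictor, int correct)
   {
     pthread_spin_lock (&predictor->lock);

     if (correct)
       {
         if (predictor->accuracy_count < 32)
           predictor->accuracy_count++;
       }
     else
       {
         if (predictor->accuracy_count > 0)
           predictor->accuracy_count--;
       }

     pthread_spin_unlock (&predictor->lock);
   }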