/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 * This file is part of SableVM.                             *
 * See the file "LICENSE" for Copyright information and the  *
 * terms and conditions for copying, distribution and        *
 * modification of SableVM.                                  *
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

================================
Files in sablevm/src/libsablevm/
================================

### TODO: needs updating

Here we describe the core files in sablevm/src/libsablevm/ that come
together to make the libsablevm.so file at compile time. Several of
these files are written using m4 macros, to save space and reduce code
duplication. It is strongly recommended that you learn to read and
write m4 following existing examples, although you can look at the
generated files if you want pure C. Files generated from m4 macros are
not included in the following list.

When you start a new project, first go through the following list of
files and make a condensed list of those that you really need to look
at.

bootstrap.m4.c:
    get ready to create java/lang/Class instances.
    create a whole bunch of fundamental classes (lots of exception
    classes).

buffer.m4.c:
buffer.m4.h:
    dependence buffer for speculative execution (not released yet)

cast.list:
    list of all macro-supported casts possible within SableVM C source
    code (_svmt_cp_info, _svmt_attribute_info, _svmt_type_info,
    jobject, jarray, _svmt_native_ref, etc.)

cast.m4.c:
    macro m4svm_cast($1,$2,$3) to cast $3 to $2

cl_alloc.list:
    list of method declarations for class memory allocation

cl_alloc.m4.c:
cl_alloc.m4.h:
    macros to implement methods declared in cl_alloc.list

class_file_parser.h:
    short C preprocessor macro to parse class files. not really sure
    why this is necessary ... Etienne says it may be obsolete.

class_loader.c:
    class loading, including loading done during bootstrapping and
    loading invoked by user code

class_loader_memory_manager.c:
class_loader_memory_manager.h:
    the class loader has its own memory manager (see Ch.
    3 of thesis)

constants.h:
    constants defined within SableVM for:
        type state
        array types
        constant pool tags
        access flags
        types
        signal codes
        2 specific constants
        interpreter flags
        stack map types
        thread status
        instructions

---------- FROM THE sablevm-developer LIST ------------
>> all constants defined should go into constants.h and this should be a
>> policy for anyone modifying the source. i'm not sure if there are
>> constants defined elsewhere in the VM.

Except for system-specific things, of course.
-------------------------------------------------------

direct_threaded.m4:
    single-line macro definition (why is this file necessary?)

error.c:
    code for using signal handlers to throw exceptions

error.list:
    list of exceptions and errors handled by SableVM

error_bits.m4.h:
error_classes.m4.h:
error_init_methods.m4.h:
error_instances.m4.h:
    very short multicallable macros to declare error types.
    not sure why all four of these files exist ... they are
    very similar and all declare macros of the same name.

---------- FROM THE sablevm-developer LIST ------------
>> error_bits.m4.h:
>> error_classes.m4.h:
>> error_init_methods.m4.h:
>> error_instances.m4.h:
>> very short multicallable macros to declare error types.
>> not sure why all four of these files exist ... they are
>> very similar and all declare macros of the same name.

That's the idea: they declare macros of the same name, yet expand
differently. Look into src/libsablevm/Makefile.am again.
-------------------------------------------------------

error_throwing.c:
    macro to implement throwing specific errors

fatal.c:
fatal.h:
    code to throw fatal errors that abort the VM

gc_copying.c:
    semi-space Cheney copying collector

gc_generational.c:
    currently broken generational GC ... Etienne suggested getting rid
    of it altogether (it will still exist in the repository if you
    "delete" it).

gc_none.c:
    no GC memory management (just allocate heap and operate until it is
    full).
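
The reply above about error_bits.m4.h and its three siblings (same
macro name, different expansion each time) describes a general
technique that can also be shown with the plain C preprocessor. The
following is only a sketch of that idea, not SableVM's m4 code:
ERROR_LIST, ERROR_ENTRY, error_name() and the three entries are
hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical list of entries (stands in for a file such as error.list). */
#define ERROR_LIST \
    ERROR_ENTRY(NullPointerException) \
    ERROR_ENTRY(OutOfMemoryError) \
    ERROR_ENTRY(ArithmeticException)

/* First expansion: each entry becomes an enum constant. */
#define ERROR_ENTRY(name) ERR_##name,
enum error_kind { ERROR_LIST ERR_COUNT };
#undef ERROR_ENTRY

/* Second expansion: same macro name, but each entry is now a string. */
#define ERROR_ENTRY(name) #name,
static const char *const error_names[] = { ERROR_LIST };
#undef ERROR_ENTRY

/* Map an enum constant back to its name; both tables stay in sync
   because they are generated from the same list. */
static const char *error_name(enum error_kind kind)
{
    assert((int) kind >= 0 && (int) kind < ERR_COUNT);
    return error_names[kind];
}
```

Adding an entry to the list updates every expansion at once, which is
the benefit the reply points at. Which definition of the macro is in
effect is decided here by #define/#undef; in SableVM the selection is
made through the build (see Makefile.am).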

global_alloc.list:
    list of method declarations for global allocations

global_alloc.m4.c:
global_alloc.m4.h:
    macros to implement malloc, calloc, and free methods as specified
    in global_alloc.list

global_refs.c:
global_refs.h:
    code to create and free native globals (in what context?).
    requires synchronization on vm->global_mutex.

---------- FROM THE sablevm-developer LIST ------------
>> I need a better explanation for global_refs.* and local_refs.*

The idea is that you get type-safe allocation and free functions. In
addition, the free function resets the pointer variable to NULL, making
sure any subsequent attempt to dereference the freed pointer causes a
segfault (instead of using an obsolete pointer value, which can be
very, very difficult to track). Also, these functions do everything
necessary to throw an exception in case of an out-of-memory situation.

Being type-safe, you know that the allocated memory is of the right
size. Once you have accumulated a lot of C development experience,
you'll know how easy it is to make the following mistake when you
copy/paste code:

    sometype1 *var = malloc (sizeof(sometype2));

And how easy it is to forget to check whether var == NULL. You'll also
notice how annoying it is to retype the correct error handling code.

The global_refs do all of the tedious work for you, so you only have to
write:

    if (_svmm_gzalloc_gc_map_node (env, method->parameters_gc_map) != JNI_OK)
      {
        return JNI_ERR;
      }

As for the ..._no_exception versions, they do the same, but they do not
create and throw an OutOfMemoryError. They are necessary because, in
early bootstrapping, the VM has not yet created the heap, so it cannot
instantiate an Error object instance.
-------------------------------------------------------

heap_manager.c:
    code to get object hashcodes and to include the GCs. it seems that
    the code at the end of this file is out of date, if not the whole
    file ... Etienne says "Possible".

heap_manager.h:
    empty file essentially ... Etienne says "Probably".
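
The reply above on type-safe allocation (correct size by construction,
failure reported to the caller, pointer reset to NULL on free) can be
sketched in plain C as follows. This is an illustration under stated
assumptions, not the code generated from global_alloc.list:
DEFINE_ALLOC, ALLOC_OK, ALLOC_ERR and the gc_map_node layout are
hypothetical, the functions take a pointer-to-pointer where the real
macros take the variable itself, and no exception is thrown on failure
(so it is closer in spirit to the ..._no_exception variants).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for JNI_OK / JNI_ERR. */
enum { ALLOC_OK = 0, ALLOC_ERR = -1 };

/* Hypothetical payload type, for illustration only. */
typedef struct
{
  int size;
  unsigned char *bits;
} gc_map_node;

/* Generate a typed, zero-initializing allocator and a typed free
   function for "type". The size always matches the pointer type, and
   the free function resets the caller's pointer to NULL. */
#define DEFINE_ALLOC(type)                      \
  static int zalloc_##type (type **ptr)         \
  {                                             \
    type *tmp = calloc (1, sizeof (type));      \
    if (tmp == NULL)                            \
      {                                         \
        return ALLOC_ERR;                       \
      }                                         \
    *ptr = tmp;                                 \
    return ALLOC_OK;                            \
  }                                             \
  static void free_##type (type **ptr)          \
  {                                             \
    assert (ptr != NULL);                       \
    free (*ptr);                                \
    *ptr = NULL;                                \
  }

DEFINE_ALLOC (gc_map_node)
```

A caller would write "if (zalloc_gc_map_node (&node) != ALLOC_OK)
return ...;" and later "free_gc_map_node (&node);", after which node is
NULL again and any stale use fails immediately instead of silently
reading freed memory.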

initialization.c:
    class initialization code.

inlined_threaded.m4:
    single-line macro definition (why is this file necessary?)

instructions.m4.c:
    implementations of all bytecodes

instructions_preparation.m4:
    single-line macro definition (why is this file necessary?)

instructions_preparation.m4.c:
    macros for preparing switch-, direct-, and inlined-threaded
    instructions

instructions_switch.m4:
    single-line macro definition (why is this file necessary?)

---------- FROM THE sablevm-developer LIST ------------
>> I don't know why we need direct_threaded.m4, inlined_threaded.m4,
>> switch_threaded.m4,
>> instructions_preparation.m4, and instructions_switch.m4. In general, I
>> think the very short files like these are unnecessary.

They are necessary. Look in src/libsablevm/Makefile.am. These files
affect the selection of the appropriate macro expansion.
-------------------------------------------------------

instructions_switch.m4.c:
    another switch-threaded macro, confused about this file.

---------- FROM THE sablevm-developer LIST ------------
>> I don't know why instructions_switch.m4.c exists, it seems similar
>> to the content in instructions_preparation.m4.c and there are no files
>> called instructions_inlined.m4.c, or instructions_direct.m4.c.

It is necessary. Unlike the direct- and inlined-threaded engines, the
switch-based interpreter has to provide a real "switch" statement of
bytecode implementations to be included in _svmf_interpreter()
[interpreter.c], yet it also needs a separate "switch" (like the two
other engines) to provide information about each bytecode (in
anticipation of method preparation [prepare_code.c]).

The key to understanding what is happening is to trace the execution of
_svmf_interpreter using a debugger (I suggest DDD) by setting
breakpoints at the appropriate locations. For this, you need to compile
the 3 engines, ideally with --enable-debugging-features, unless you
want to be intrigued by the segfault used in the normal control flow of
the VM...
-------------------------------------------------------

interpreter.c:
interpreter.h:
    the main interpreter engine that can be compiled to run in switch-,
    direct-, or inlined-threaded modes.

invoke_interface.c:
invoke_interface.h:
    JNI methods to create and destroy VMs, with AttachCurrentThread,
    DetachCurrentThread, and GetEnv still unimplemented.

java_lang_*:
    JNI methods for java.lang.* ... some of these are fundamental to VM
    operation, others are not and remain to be implemented.

---------- FROM THE sablevm-developer LIST ------------
>> move the java_lang_* stuff into libsablevm/java and remove vmlib.*
>> altogether.

No. The Auto* tools don't really support compilation of a single
library out of files in multiple directories.

If you want the C optimizer to do a good job, you want to compile a
single library in one shot. I never understood why people do a per-file
compilation of C code to .o. It makes no sense; C wasn't designed like
that. An optimizing compiler cannot inline functions or do any global
optimization if you arbitrarily separate your compilation unit into
smaller units on a source file boundary. It only makes sense to
generate a .(s?)o if you plan to reuse the same functionality across
many executables/libraries, and don't want your optimizer to do any
optimization across function boundaries.

libsablevm.so should only export a restricted set of symbols: JNI_* and
Java_*. Using more than one compilation unit would cause the
exportation of additional (read: internal!) symbols.

So, unless you want to fix the GNU auto* tools, you'll have to live
with the current file structure.
-------------------------------------------------------

jnidefs.h:
    definitions of jobject, jarray, jfieldID, and jmethodID. why do
    these get their own special file, instead of being included in
    types.h or jni.h?

    Etienne's answer:
    Because jni.h is installed on the user's system and made available
    for Java programmers to be able to write JNI libraries to link with
    the VM.
    The types above are "opaque" types, from a Java programmer's point
    of view, so that the same JNI library can work with *any* JVM
    implementing the JNI interface. Of course, within SableVM itself,
    these types shouldn't be opaque, so there you have jnidefs.h to
    "enlighten" these types.

lib_init.c:
    not really sure, seems to be only for calling lt_dlinit(), and
    ensuring that it only gets called once.

    Etienne's answer:
    You can invoke JNI_CreateJavaVM multiple times (concurrently)
    within a single process (as long as each call is on a different
    thread). lib_init.c serves to make sure some global libsablevm
    initialization happens only once, using the standard POSIX
    pthread_once().

libsablevm.c:
    big include file for compilation to a single object (see
    Makefile.am)

---------- FROM THE sablevm-developer LIST ------------
>> all #include directives should go in libsablevm.c

except for pthread_rec*, and some other ones like in
_svmf_interpreter().
-------------------------------------------------------

link.c:
link.h:
    class, array, and type linking.

local_refs.c:
local_refs.h:
    similar to global_refs.c but for locals.

macros.c:
    casting functions and _svmf_is_same_object. not really sure why
    these cast functions are necessary, can't we generate them using
    cast.list and cast.m4.c ... maybe this file is obsolete.

macros.h:
    some C preprocessor macros. seems like these are probably obsolete
    given the widespread use of m4 now.

---------- FROM THE sablevm-developer LIST ------------
>> it seems that everything in macros.c should go in cast.list, except
>> for _svmf_is_same_object which should go in one of the util files. I'm
>> not sure if macros.c is obsolete or not. It seems that macros.h is
>> probably obsolete as well, and if not, should probably be targeted for
>> replacement with m4 macros in the future.

You are mostly right. A couple of methods cannot be generated using
cast.list because of the additional conditional. Also, the name
"macros.[ch]" is not really intuitive.
It is part of a few legacy things from before the SableVM
rearchitecturing to use m4...
-------------------------------------------------------

macros.m4:
    basic m4 macros for use within the rest of the VM. learning how m4
    works is a good idea if you want to understand SableVM.

Makefile.am:
    input file for automake

method_invoke.list:
    declarations of VM-critical nonvirtual and static methods. not sure
    what the "specific" methods are all about.

---------- FROM THE sablevm-developer LIST ------------
>> unclear about "specific" methods in method_invoke.list

The non-"specific" method invocations are invocations of methods
declared in classes which are automatically loaded by the bootstrap
class loader.

The "specific" method invocations are used to invoke a "specific"
version of a method signature selected at runtime. So, in the
"specific" case, you have an additional formal parameter for providing
a _svmt_method_info pointer.

For example, to invoke the class initializer method of a class (part of
class initialization), the VM needs the _svmt_method_info which is
specific to that class. It cannot use a generic _svmt_method_info from
a method/class loaded at bootstrap time.

If you don't like the "specific" name (I'm starting to dislike it quite
a bit), I'm open to suggestions for a replacement.
-------------------------------------------------------

method_invoke.m4.c:
method_invoke.m4.h:
    macros containing the bodies of methods declared in
    method_invoke.list

native.c:
native.h:
    code to enable native method execution in SableVM

native_interface.c:
    implementation of all JNI methods

native_interface.h:
    very small file defining the JNINativeInterface extern.

---------- FROM THE sablevm-developer LIST ------------
>> do you need native_interface.h?
>>

These are remnants of the old SableVM code base where I was using
separate compilation units (as is most intuitive with auto* tools). So,
unless the "extern" needs a forward declaration, we could maybe get rid
of it.
[Order of inclusion in libsablevm.c can be a good technical reason to
keep a .h file. :-)]
-------------------------------------------------------

new_instance.c:
new_instance.h:
    methods to create objects and arrays.

predictor.m4.c:
predictor.m4.h:
    value predictor for speculative execution (not released yet)

prepare.c:
prepare.h:
    prepare interfaces and classes, but not method bodies.

prepare_code.c:
prepare_code.h:
    prepare method bodies (see Ch. 2 of thesis)

pthread_rec_svm.c:
pthread_rec_svm.h:
    implementation of recursive locks on top of pthread

---------- FROM THE sablevm-developer LIST ------------
>> all type declarations should go in types.h .... this should be a
>> policy like for constants.h ... the stuff in system.h obviously
>> shouldn't be in there though. % grep typedef * shows that only
>> pthread_rec_svm.* violates this.

pthread_rec_svm.* should remain so. It is an implementation of
recursive mutexes on top of POSIX non-recursive mutexes. These files
can easily be reused by other applications, with no dependency
whatsoever on other SableVM internal data structures & types. This
could go into a separate directory, yet I prefer to keep more
optimization opportunities by leaving them as part of the same
compilation unit.
-------------------------------------------------------

resolve.c:
resolve.h:
    code to resolve methods, classes, and interfaces when found in
    bytecode during execution.

splay_tree.list:
    list of splay tree kinds used within SableVM (there are 4, one each
    for types, gc_maps, sequences, and interface_method_signatures).

splay_tree.m4.c:
    macros to generate splay tree code for the kinds listed in
    splay_tree.list

switch_threaded.m4:
    single-line macro definition (why is this file necessary?)

system.c:
    system-specific implementations

system.h:
    system-specific header

thread.c:
thread.h:
    thread management functions (initialization, stopping for GC, etc.)

types.h:
    This file is really at the heart of SableVM. It defines the data
    structures used in SableVM.
    When looking at the VM for the first time, you should spend some
    time reading this file.

---------- FROM THE sablevm-developer LIST ------------
util.h:
    a few short utility C preprocessor macros ... not sure if these are
    defined elsewhere or not. for example, it seems that the memset()
    wrapper should go into global_alloc.m4.c with the other memory
    management stuff

util.m4.c:
    some put/get macros. seems like they complement those in util1.c
    but I'm not sure.

util1.c:
    various utilities for working with C code and for getting stack
    traces. not sure why the put/get_BOOLEAN/REFERENCE_field/static
    methods are in here ...

util2.c:
    more utility code, this time mostly for working with Java data
    structures. seems to me like the Java stuff should go in one util
    file and the C stuff in another.

>> util.h, util.m4.c, util1.c, and util2.c need to be cleaned up (see
>> my comments in the file). split
>> the util functions into two categories: those for working with the C
>> language, and those for
>> working with Java data structures. furthermore, util.h could be built
>> from an m4 file and a list file it
>> seems.

Patches are welcome. The reason it is not all in a single file is
compile dependencies (order of #include's).
-------------------------------------------------------

verifier.c:
verifier.h:
    bytecode verifier (not fully implemented).

vm_args.m4.c:
    macros to handle arguments to the VM from the command line (or from
    whoever creates it?)

vmlib.c:
vmlib.h:
    some JNI methods ... not sure if this file is perhaps obsolete (or
    at least part of it, because it seems some java.lang.* stuff is in
    here). the header file is empty.