|
fixed Map and PMap with oType=void (pmap with oType=void would not work because of some bad declarations).
erwin-cgen: allow cast_*() functions to be dynamic
fixed operator== etc. to have 'const' qualifier. Map and Vector data structures templates were affected. This bug could lead to very strange behaviour when you expected the special == to be used, but instead the compiler used the default byte-wise == for the struct. Very ugly bug indeed.
prepare for cmake: the GNU attributes are now enabled in defs.h without the need for configure. This is a bit more dangerous or, better, less generic wrt. other architectures and compilers, but untested compilers/flag combinations/architectures won't work anyway with a very high probability...
Added CMakeLists.txt to erwin subdir. It's very basic now, but it compiles libraries. Dunno how to tell cmake which order the object files have to be linked in. For Unix, we used lorder+tsort earlier, which cmake seems to ignore. I have no idea how that is supposed to work. Further investigation is necessary.
There is currently no support for listing rules, so I also don't know how to include the Erwin sub-directory in own libraries easily. Before cmake, the list rule was used to include the object files in a higher level library, but I don't know how that works now.
Made many Perl scripts independent from the Unix shell environment and its standard commands so that they are more likely to work under other OSes (thanks to Christoph Cullmann). This is a prerequisite for CMake integration.
Added erwin_prev_power2().
Added erwin_trailing_0s() and erwin_trailing_0s_non0() (the ntz() function from Hacker's Delight).
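For illustration, here is a minimal sketch of the binary-search ntz() idea from Hacker's Delight for 32-bit values. The names sketch_trailing_0s() and sketch_trailing_0s_non0() are hypothetical stand-ins, and returning 32 for an input of 0 is an assumption; the actual Erwin functions may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "_non0" variant: the argument must not be 0. */
static int sketch_trailing_0s_non0(uint32_t x)
{
    int n = 0;
    assert(x != 0);
    /* Binary search for the lowest set bit: skip 16, 8, 4, 2, 1 zero bits. */
    if ((x & 0x0000FFFFu) == 0) { n += 16; x >>= 16; }
    if ((x & 0x000000FFu) == 0) { n += 8;  x >>= 8;  }
    if ((x & 0x0000000Fu) == 0) { n += 4;  x >>= 4;  }
    if ((x & 0x00000003u) == 0) { n += 2;  x >>= 2;  }
    if ((x & 0x00000001u) == 0) { n += 1; }
    return n;
}

/* Hypothetical stand-in for the general variant; 0 yields 32 (an assumption). */
static int sketch_trailing_0s(uint32_t x)
{
    return x == 0 ? 32 : sketch_trailing_0s_non0(x);
}
```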
Corrected the comparison function for unsigned types, which were all wrong for types with size > sizeof(int). This was probably not as bad as it sounds since they did define some order, but the signed one, not the unsigned one. Derived equality still worked, and hash tables and any arrays depending on a complete ordering also worked. Anyway, this was quite a serious bug.
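To illustrate the kind of bug described here: the exact form of the old code is not shown in these notes, so the following is only an assumed reconstruction of the effect. Comparing 64-bit unsigned values in signed order still gives a total order and correct equality, but large values sort before small ones. Both function names are hypothetical.

```c
#include <stdint.h>

/* Illustration of the faulty behaviour: values are compared in signed order.
 * (Converting values above INT64_MAX is implementation-defined; on common
 * two's complement platforms they become negative.) */
static int sketch_cmp_u64_signed_order(uint64_t a, uint64_t b)
{
    int64_t sa = (int64_t)a;
    int64_t sb = (int64_t)b;
    return (sa > sb) - (sa < sb);
}

/* Three-way comparison in unsigned order: returns -1, 0 or 1. */
static int sketch_cmp_u64(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}
```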
Documentation extraction was fixed for some accessor functions.
Removed some debug code (#defines) that was not supposed to be there: every Erwin lib accidentally had signature checking enabled, making the structures larger and the code slower. Signature checking is meant to be enabled only on the developer's request.
Added ERWIN_XCHG macro for simple types and Map_xchg() for maps. For lists, it is still missing (not much work, but not yet done).
A change a few months ago silently broke gcc-3.0..gcc-3.3 compilation with -O. It should now be ok again.
Added voidp_hash() in analogy with other TYPE_hash() functions. Unfortunately, a badly named function hash_voidp() existed already, which does not conform to the naming conventions, and we have to keep it now for compatibility.
Fixed a type-punned pointer problem in an assertion and checked that no other cases of this type exist. But this is really hard to track down when the compiler does not warn...
Fixed several parser bugs in erwin-cgen.
Use more GCC __attribute__(()) for better optimisations and warnings.
The memory allocation interface now supports realloc/recalloc with passing of the old size. This might be needed by some memory managers, especially for recalloc, and Erwin structures know the old size anyway. See defs.h for details about the implementation. (Well, currently recalloc is not used by Erwin structures, but maybe some day it will be used...)
Further, we distinguish atomic and non-atomic arrays, i.e., those that are guaranteed not to contain pointers and all others.
the memory allocation interface was changed incompatibly, because it was wrong and could not be repaired without breaking compatibility. Sorry for that. See defs.h for more details.
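Why passing the old size matters for recalloc can be seen in a small sketch: only with the old size can the newly gained bytes be zeroed. The function name and signature below are hypothetical and are not the interface from defs.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical recalloc: like realloc(), but newly gained bytes are zeroed.
 * The caller passes the old size, which the data structure knows anyway. */
static void *sketch_recalloc(void *ptr, size_t old_size, size_t new_size)
{
    void *p;
    assert(new_size > 0);  /* realloc() with size 0 is deliberately avoided */
    p = realloc(ptr, new_size);
    if (p != NULL && new_size > old_size) {
        memset((char *)p + old_size, 0, new_size - old_size);
    }
    return p;
}
```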
Improved initial_size handling in Maps: the 'initial_size' is now the expected number of elements, not the raw hash size. This is because most people probably would not want to know a good hash size, but do know the number of elements.
Improved space performance of copy constructors
Added function Map_expect_size() to realloc to an expected number of elements.
When the library is compiled for mixed C and C++ usage, the internal memory management is mapped to malloc/free now instead of new/delete. This is because _as_array_detach() must be usable from C code, too, where only free() is available.
Existing code might need adjustment to please tools like valgrind. On some platforms, using delete after malloc() might cause problems, so this is actually an API change. I am sorry for this. But the old implementation was just wrong.
To make this work properly, most memory allocating functions now have a deallocation function (like Map_get_entries + Map_delete_entries).
The inline assembly on x86_64 and i386 targets was improved. Or better, it was corrected: in some cases, Global_erwin_count_bits would fail to compile.
It was also improved by allowing the compiler more choices of input locations.
Added Global_ERWIN_IS_POWER2, Global_erwin_is_power2, Global_erwin_next_power2, Global_erwin_next_power2_minus1
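The usual bit tricks behind helpers of this kind can be sketched as follows for 32-bit values. The names are hypothetical, and the exact semantics are assumptions: here 0 is not treated as a power of two, and "next" means "smallest power of two >= x". The Erwin functions may define these cases differently.

```c
#include <assert.h>
#include <stdint.h>

/* True iff exactly one bit is set. */
static int sketch_is_power2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Smallest value of the form 2^k - 1 that is >= x:
 * the highest set bit is smeared into all lower positions. */
static uint32_t sketch_next_power2_minus1(uint32_t x)
{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x;
}

/* Smallest power of two >= x, for 1 <= x <= 2^31. */
static uint32_t sketch_next_power2(uint32_t x)
{
    assert(x >= 1 && x <= 0x80000000u);
    return sketch_next_power2_minus1(x - 1) + 1;
}
```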
Vectors on 64 bit machines do not waste memory anymore by using size_t element counters internally by default when the API uses only int anyway.
By this, default (CONSTANT_ZERO) vectors on 64 bit machines are now 16 bytes (8 bytes table pointer + 4 bytes element count + 4 bytes table size), which is nicely aligned.
Vectors may be switched to size_t element counts, switching both the implementation and the API (option Vector_LARGE_INDEX). This is nice on 64 bit machines if you need vectors with > 2G entries.
The vector size is then 24 bytes (8 + 8 + 8), which might lead to some waste in memory allocation.
Note that the API change is incompatible. Use Vector_cnt_t and Vector_index_t instead of int or ssize_t if you need compilation for both API variants.
Via the new Vector_SMALL_SIZE option, vectors may be switched to 'short' element counts internally, leaving the API compatibly using 'int'. The normal vector then only has 8 bytes (4 + 2 + 2) instead of a somewhat unaligned 12 bytes (4 + 4 + 4).
Mainly for machines where an 'int' is 8 bytes, there is a new option Vector_MEDIUM_SIZE that tries to switch the internal element count to some 32 bit type, but leaving the API compatible by using 'int'.
I do not know whether this will often be used, but it was only a few additional lines, so there it is.
Thanks to ingmar's debugging, vectors now support the LOW_MEM option. (Later we found that another few things were missing, but now I think it does work. Note that LOW_MEM is not well tested, however.)
This makes it possible to have 8 byte vectors (4 + 4) on 32 bits and 16 byte vectors (8 + 8) on 64 bits (thus both perfectly aligned) with the maximally possible number of elements (size_t if necessary). A vector then only consists of a pointer and an element count. The table size is removed and instead inferred from the element count. (A similar effect can be achieved by using Vector_MEDIUM_SIZE on 64 bit machines.)
Note that in some (somewhat strange) cases, runtime behaviour is computationally worse (O(n^2) instead of O(n) for n append/chop operations): a vector of size 2^n-1 will reallocate each time in a sequence of append, chop, append, chop, ...
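A small simulation of this effect, under the assumption that the table size is inferred as the smallest power of two strictly greater than the element count (the real rule is not stated here, and the function names are hypothetical). With that rule, a vector holding 2^n-1 elements changes its table size on every single append and chop.

```c
#include <stddef.h>

/* Assumed rule: table size = smallest power of two strictly greater than count. */
static size_t sketch_inferred_table_size(size_t count)
{
    size_t size = 1;
    while (size <= count) {
        size *= 2;
    }
    return size;
}

/* Counts table size changes over 'pairs' repetitions of append-then-chop,
 * starting from a vector with 'count' elements. */
static size_t sketch_count_resizes(size_t count, size_t pairs)
{
    size_t resizes = 0;
    size_t i;
    for (i = 0; i < pairs; i++) {
        /* append one element */
        if (sketch_inferred_table_size(count + 1) != sketch_inferred_table_size(count)) {
            resizes++;
        }
        count++;
        /* chop one element */
        if (sketch_inferred_table_size(count - 1) != sketch_inferred_table_size(count)) {
            resizes++;
        }
        count--;
    }
    return resizes;
}
```

Starting at 7 elements, every operation resizes the table; starting at 5 elements, none does.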
Removed some style sheet options in documentation processing when an HTML template is used: the user probably wants to be able to define the colours manually.
added Map_remove_if (like Map_erase_if, but does not free the elements)
added Vector_init, Vector_init_with_initial_size, Vector_init_with_zero_and_initial_size and Vector_destroy for non-heap vectors.
optimised Map_rehash for faster operation with less alloc/free/realloc.
added Vector_find_ptr and Vector_rfind_ptr
extended classdef/classundef/newdelete to provide fake reference counting to keep APIs stable
List_forall_reverse fixed
reprogrammed the Map_rehash so that the list cells are not de- and reallocated but are instead moved from the old table to the new one. This improves overall memory performance. Further, it makes pointers to map cells stable wrt. rehashing. Currently, the following code is crash-prone:
int &i= v["hello"];     // returns reference to map cell w/ key "hello"
v.set ("anything", 1);  // might trigger rehash
i= 5;                   // SIGSEGV (maybe)! since reference might have changed.
use __thread or <pthread.h> for a thread-safe version of global errno variables if ERWIN_THREAD_SAFE is requested (should work on Win/MSVC, Linux/gcc, Mac OS X).
added erwin_memmem and erwin_memcmp (if possible, simply the system function).
Optimised hash functions (faster and better distributing). See map.h for more details, especially for functions to hash memory areas and functions for combining hash values (e.g. for hashing structs).
The new hash functions are collision free and are fast while still distributing well.
The new hash functions make the golden ratio in hash_into superfluous, because no additional distribution is necessary. In fact, the old hash_into was not collision free.
It is now replaced by a simpler function that multiplies hash*tablesize and uses the upper bits of the result. I will probably add the golden ratio function under a different name again later, because the implementations were quite sophisticated (and implemented in assembly for several architectures) and might still be useful.
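The multiply-and-take-upper-bits idea can be sketched for a 32-bit hash value as follows. The function name is hypothetical and the real hash_into works on hashval_t, which may be 64 bits wide. Note that a hash value whose upper bits are all zero always lands in slot 0 for small tables.

```c
#include <assert.h>
#include <stdint.h>

/* Maps a 32-bit hash value to a slot in [0, table_size):
 * compute hash * table_size in 64 bits and keep the upper 32 bits. */
static uint32_t sketch_hash_into(uint32_t hash, uint32_t table_size)
{
    assert(table_size > 0);
    return (uint32_t)(((uint64_t)hash * table_size) >> 32);
}
```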
The change of hash_into might affect people who have cast hashval_t to unsigned int (bad thing to do!) and then used hash_into, because on 64 bit machines, the cast erases the upper 32 bits, making hash_into hit slot 0 for small table sizes. If you cannot switch your project to a clean usage of hashval_t, you should instead:
#define Global_SIZEOF_HASHVAL_T 4
(Replacing Global_ with the library prefix, of course.) This switches the hashval_t to 32 bit width again, making it compatible with casting to unsigned int (on 32 bit machines).
If you use Global_REQUIRE_DETERMINISM, hashval_t is also forced to 32 bits in order to get the same hash results on 32 and 64 bit machines.
A lot of work was spent on improving 64-bit support, especially x86_64.
User hash functions are now easier to write well; we added a few macros. See Quick Manual and include/erwin/map.h, erwin_hash_state_t.
In order to repair a name clash for Global_oType_CMP and Global_oType_PRIORITY_CMP, which were needed in the header file because the macros Vector_DEFAULT_ARG_CMP and Vector_DEFAULT_ARG_PRIORITY_CMP depended on them, the two previously mentioned macros are removed and the corresponding macros replaced by Vector_DEFAULT_ARG.
This means that there are default arguments even if the cmp function is not defined. There is no easy way around this, because there are more complications: the files defaults.h and forwards.h should only be visible to the implementation, but they are needed by many _CMP definitions. This flaw was noticed very late, so some things might have been broken before and users included defaults.h themselves, which is clearly undesirable (nothing really wrong with including them, but the user should not be forced to do so).
Well, forwards.h might be a candidate for inclusion in the header files, too, to make the --include stuff unnecessary for recursive data structures. Later.
The macro Vector_t_TYPE_INFO_STD_MEMBERS was renamed to Vector_TYPE_INFO_STD_MEMBERS. E.g. for oType=int:
OLD: vector_int_t_TYPE_INFO_STD_MEMBERS
NEW: VECTOR_INT_TYPE_INFO_STD_MEMBERS
The same goes for Map and List.
Please edit your files accordingly, although the old value is still recognised and used if #defined, in order to be downward compatible.
My Lisp utils are needed to run 'make options' (a developer thing; you will not need it), and I did not have any standard place for the few Lisp libraries I wrote, so I included them here. To install the (yet undocumented) Lisp tools, run 'make install-lisp'. You need CLisp to use the Lisp stuff.
Added function Vector_xchg(a,b) to quickly exchange the contents of two vectors. Probably will add the same for Map and List. Also, a macro ERWIN_XCHG was added.
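A minimal sketch of why exchanging vector contents is quick: only the header fields are swapped, never the elements. The macro signature (with an explicit type argument) and the struct layout below are assumptions for illustration, not the real ERWIN_XCHG or vector definition.

```c
#include <assert.h>

/* Hypothetical exchange macro for simple types; the type is passed explicitly. */
#define SKETCH_XCHG(Type, a, b) \
    do { Type sketch_tmp_ = (a); (a) = (b); (b) = sketch_tmp_; } while (0)

/* Hypothetical vector header: table pointer, element count, table size. */
struct sketch_vec {
    int *table;
    int count;
    int alloc;
};

/* O(1) exchange: the heap tables themselves stay where they are. */
static void sketch_vec_xchg(struct sketch_vec *a, struct sketch_vec *b)
{
    assert(a != 0 && b != 0);
    SKETCH_XCHG(int *, a->table, b->table);
    SKETCH_XCHG(int, a->count, b->count);
    SKETCH_XCHG(int, a->alloc, b->alloc);
}
```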
Added Map_ensure_steal(). The functionality was only there using poke() otherwise, and Map_ensure_steal() is more efficient and elegant.
We'd also need Map_insert_steal() and Map_set_steal() by analogy now. We'll see.
erwin-cgen can now generate enum2string() functions for you. The C versions still have a broken name, though. These functions return the char const * representation for each enum value, so you can easily print them. (I never understood why C++ does not provide stuff like this automatically.)
added Debian package support
changed default of Vector_DEBUG_EXPENSIVE_CHECKS from 1 to 0
made range checking the default even for -DNDEBUG, because an out-of-range vector index is not necessarily an indication of a programming bug. And since it might be a maliciously triggered buffer overflow, we protect against this.
To switch the checks off, add -DALL_ERWIN_NO_RANGE_CHECK=1 to your compile flags.
Some vector options are now available universally, i.e., not only as global options for the Erwin library, but for all Erwin libraries to be compiled. The prefix for these macros is ALL_ERWIN_.
ALL_ERWIN_NOMEM_IS_FATAL
ALL_ERWIN_ASSERTION_FAILED_HANDLER
ALL_ERWIN_NO_RANGE_CHECK
ALL_ERWIN_LOW_MEM
ALL_ERWIN_INLINE_FUNCTIONS
these variables are not for setting in settings.h or in Makefiles, because those are for global options only, since included Erwin libraries' implementations cannot be changed by that file anymore. So these are likely to occur in universally visible shell variables or in compilation scripts that compile a whole bunch of libraries and bundled projects, like
export CPPFLAGS='-DNDEBUG -DALL_ERWIN_NOMEM_IS_FATAL=1'
Otherwise, the options will not take effect in included Erwin libraries.
map and list also have the ALL_ERWIN_* variants now (if the option makes sense there).
With MANY_CASTS, pvectors failed to compile. It is quite unbelievable that this bug was not found earlier.
Improved hash functions
added stringify() family to char vectors
added prepend() to C++ interface of vectors
fixed mergesort to not cast size_t to int (since there is no ssize_t, we currently use long, which is better but still not perfect)
fixed and enhanced Vector::format() (some floating point stuff could crash, e.g. %*.*f and the like)
Added Vector_INLINE_STORE option. This makes vectors try to store elements where the pointer to the heap table would be. E.g. a VChar then stores 4 or 8 (depending on sizeof(char*)) bytes right inside the VChar box, before allocating a heap array.
This makes the implementation more complex and also slightly slower, of course, but may save quite some space for certain applications.
The current options can safely be enabled for the whole library. If the pointer size is smaller than the element size, then the default behaviour is used. An implementation that always reserves a certain user-defined number of inline elements is underway.
inline element store for vectors: when the user sets the INLINE_STORE option for a vector, elements are tried to be stored where the pointer to the heap table is stored if possible. For many vectors that store elements with sizeof(oType) <= sizeof(pointer), 1..8 elements can be stored right inside the box and spare the memory for the heap table.
maps, when MINIMAL_SIZE is 0, are now slightly different: as before, they allow that no heap data is allocated at all if the map is empty. Additionally, if there are only a few elements in the map, no bucket table is allocated, but a single list is used directly. This saves the memory for the bucket table for small maps.
untemplatize --erwin-prefix now hopefully works.
untemplatize now uses the invocation path as a first default to find the library. Only when this fails, the installation default path is used. This allows for easy movement of the library in the file system, and for easier installation in general in many typical situations.
Maps now handle oType=void, making them sets without an output value and consuming less memory.
Some functions have a different number of parameters, e.g. insert() lacks the value. Some functions are entirely missing in this mode, because it makes no sense to return values or pointers to values, or to modify the value in the map. The find() function returns 'Global_ERWIN_BOOL' (in C++, 'bool').
The basic functions for using sets would be with insert()/ensure(), find(), and erase(). Note that iterators are missing that return the value, so the simplest iterator would be Map_forall_keys().
Implementing symbol tables is now recommended with oType=void.
Note that pmaps currently cannot handle oType=void. This will be implemented soon.
Splitting into single files is no longer supported.
Removed old code.
Added vector iterators for 'oType const *' (e.g. forall_values_ptr_const). Still missing: the same for Maps and Lists.
removed unused warnings (instead of using -Wno-unused).
added a lot of gcc attributes (const, pure, malloc, nonnull, etc.)
rearranged the assembly stuff to have as few assembly functions as possible while still getting perfect or near perfect performance
added erwin_nonnegative(x) for warning free
assert (x >= 0);
even for unsigned types. Use:
assert (erwin_nonnegative(x));
Works perfectly for C++ and ok for C. This is used by assert.pl, too.
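One possible way to phrase such a check in plain C is sketched below; this is an assumption, not the actual definition of erwin_nonnegative(). Splitting x >= 0 into two comparisons tends to avoid the "comparison is always true" warning for unsigned types, at the price of evaluating the argument twice.

```c
#include <assert.h>

/* Hypothetical macro: true iff x >= 0. Evaluates x twice, so the argument
 * must not have side effects. */
#define SKETCH_NONNEGATIVE(x) ((x) > 0 || (x) == 0)

/* Usage with an unsigned parameter, where 'assert (x >= 0)' would typically
 * draw a warning. */
static unsigned sketch_halve(unsigned x)
{
    assert(SKETCH_NONNEGATIVE(x));
    return x / 2;
}
```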
tried gcc 4.3 and removed more warnings
added more assembly for MSVC compilation
disabled makeproto script: splitting is now cancelled (it never really worked)
made QUOTE_HTML suitable for XML and SGML, too, and added constants for that
merge_sort is now thread-safe
internal switch to svn, thus version numbers are now different
Windows compilation now issues fewer warnings
updated config.guess, config.sub
added more tests and changed the message about missing %Ld support from 'non-compliant' to a warning that a GNU extension is not supported by clib (for Mac OS X clib, which, BTW, is also not POSIX compliant in printf (in "%#.0o", 0))
type-punned pointer issues fixed (new gcc)
added newdelete.h for REF/UNREF, etc. macros
fixed bad bug in Map_intersect (bad operation: did not delete everything, and SIGSEGV). Added tests to maptest2. (Thanks to Ingmar Stein for reporting)
added _eref storage class to slots (erwin-cgen): this is an inline accessed by reference (instead of via pointer as in _ref).
fixed bug in erwin-cgen that led to lines being dropped from the comments in HTML output.
for x86_64, use attribute(always_inline) for Map_poke_internal when optimising for speed, since all functions use this and without it, the code is 50% of the speed of the 32 bit versions at lower clock speeds.
add assembly support for x86_64 (erwin_hash_into)
added erwin_strnto* family that takes a string and a string length instead of a string only.
added erwin_strnlen
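For reference, the behaviour of a length-limited strlen can be sketched as below. The name is hypothetical, and where available the system's strnlen would presumably be used instead.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the string length, but never looks at more than max_len bytes. */
static size_t sketch_strnlen(char const *s, size_t max_len)
{
    size_t n = 0;
    assert(s != NULL);
    while (n < max_len && s[n] != '\0') {
        n++;
    }
    return n;
}
```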
added set-prefix to erwin-cgen
extended erwin-cgen to parse top-level operator declarations in C++ (those outside class definitions)
- added erase_if
- serious bug in vector.cd: 'unsigned ERWIN_LONG_LONG' is bad and should be 'ERWIN_UNSIGNED_LONG_LONG', since on systems with a 'typedef long long int64_t', the former does not work.
strto* functions are now NULL safe and return 0.
bug fix: DYN_ZERO define was not copied into pmap, pvector, plist. This bug must be very, very old. (Thanks to Michael Schmidt, who found this.)
classdef.h: SET_ID not defined empty if not defined before. This then makes set_id() do nothing.
- move installed helper binaries from ${libdir}/erwin/bin to ${libexecdir}/erwin to be more compliant with the file system standard.
If you upgrade, you may type 'make uninstall' before 'make install', to get rid of the files under ${libdir}/erwin/bin.
some more declarations are generated now for C. E.g., the file created by --write-announce contains renaming typedefs for simple, enum and union types, too, not only for classes.
Further, the file generated by --write-forwards includes renaming #defines for basic C types, something like:
#define crl_symbol_t CrlSymbol
If you have manual typedefs of this kind, you might encounter compilation problems that can easily be solved by deleting your manual declarations.
the files classdef.h and classundef.h for programming C++ with properties and full erwin-cgen support have been added to the distribution. They are also included in the generated erwin subdirectories. The files need documentation to be useful, however.
the markup in HTML was improved a bit: embedded code looks better now.
the -D option is allowed with any value, not only booleans, to be more similar to gcc compiler switches.
several bug fixes concerning the global prefix
new command line options: --hot-char, --c2html, --creator-comment, --quiet, --debug
doc extractor allows layout template files (currently very simple)
classdef.h and classundef.h are now included in the Erwin distro for easy C++ programming with slots/properties.
added documentation for erwin-cgen. Please note that the script is still in heavy flux, so in contrast to most of the rest of the library, some incompatible changes will likely be done in the future.
fixed the makeerwin script to work with 'which' under NetBSD. We might better use 'type' instead, as it seems to be more portable.
#ifdef defined(X) -> #ifdef X in defs.h
- the _errno bug in _copy was also fixed in List.
adds --extra-options-gcc: ... to the output file
reading, writing, modifying C interface maps and checking the current interface against them.
read file contents as options
- on-the-fly creation of values improved: It is now possible to store pointers to other maps or vectors as values in a map, while still having a non-pointer result from operator[].
This provides maximal efficiency: together with oType_ENSURE_VALUE you can generate the maps.
Old:
*(a[5]->find_ensure_ptr(6))= 7;
or
*a[5]->set (6,7)
New:
a[5][6]= 7;
To use this, use an additional oTypeIndex:
Old:
untemplatize iType=int oType='map_int_int *' \
    -DoType_ENSURE_VALUE(X)='map_int_int_new()' \
    -DoType_OFREE(X)='map_int_int_delete(X)'
New:
untemplatize iType=int oType='map_int_int *' \
    oTypeIndex='map_int_int' \
    -DoType_ENSURE_VALUE(X)='map_int_int_new()' \
    -DoType_OFREE(X)='map_int_int_delete(X)'
in vectors, the operator+(int) was not overloaded in a const version so a[5] would not work if a was 'VectorChar const &'.
The same 'const' overloading was added to nth_ptr, nth_ref, nth_ptr_char, nth_ref_char, nth_ptr_check, nth_ref_check. It is currently missing for first_ptr, first_ref, last_ptr, last_ref (and it's also missing in C). This overloading is not present with operator[](int) const, since that returns no reference at all.
Unfortunately, one of the prototypes did not work under Microsoft C++, since the overloading does not work properly. I had to disable:
oType const *Vector::operator+(int) const
(Note that the non-const version is enabled and works.)
In order to keep the interface identical under Linux and Windows, that function was disabled independently of the OS. The same functionality is provided by:
oType const *Vector::nth_ptr_check(int) const
or
oType const *Vector::nth_ptr_char(int) const
(depending on type)
If you want it, because you don't want to program under MS, use:
#define ERWIN_IGNORE_BROKEN_MS_COMPILER 1
unbelievable this has never been found: Vector_copy and Map_copy had bugs: Vector_copy did not set e_errno in case of NULL, and Map_copy did not allow NULL and raised an assertion failure(!)
It's fixed now: both allow NULL and set (vector/map)_errno to (VECTOR/MAP)_OK in that case.
- Freshmeat release
improved documentation
added --makeerwin to untemplatize
fixed HTML quotation to always use decimal notation, because older browsers are unable to parse hexadecimal character references (&#x...;) but only decimal ones (&#...;)
added better hashing with additional configurability via Global_ERWIN_HASH_STRENGTH
added Vector_swap_erase() and used it in Vector_heap_delete().
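The idea behind a swap-erase, sketched on a plain int array: the last element is moved into the erased slot, so erasing costs O(1) but the element order is not preserved. The function name and signature are hypothetical, and whether Vector_swap_erase() really swaps or just moves is an assumption.

```c
#include <assert.h>

/* Erases a[index] by overwriting it with the last element.
 * Returns the new element count. */
static int sketch_swap_erase(int *a, int count, int index)
{
    assert(index >= 0 && index < count);
    a[index] = a[count - 1];
    return count - 1;
}
```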
added Vector_first_swap_chop1 for convenience (maybe we want Vector_first_chop1, too, but this involves heavy copying, of course).
removed Vector_last_chop1_flags: do this manually if you need it. The dealloc arg is futile anyway, because it is always 0.
fixed newly introduced bug in Vector_last_chop1: don't delete the return element...
added --xxinclude to untemplatize
map iterators have the _reverse versions, too. In total, we now have 77 different map iterators...
added Vector_reverse() (C and C++)
added Vector_last_chop1() in C
fixed a bug: last_chop1() returned oTypeResult, but should've returned oTypeVar, since the reference might be invalid after reallocation. That this one never SIGSEGVed is quite amazing!
implemented a bunch of heap functions implementing priority queues in vectors:
Vector_make_heap (often called 'build_heap')
Vector_heap_left
Vector_heap_right
Vector_heap_father
Vector_heap_raise
Vector_heap_sink (often called 'heapify')
Vector_heap_fix
Vector_heap_insert
Vector_heap_delete
Vector_heap_extract (often called 'extract_max/min')
Vector_heap_sort
Apart from heap_sort, these use the new comparison function Global_oType_PRIORITY_CMP as their default comparison.
Also, there is a new recursive comparison: Vector_priority_cmp
However, there is not yet the same automatic deep comparison macro definition as for OCMP.
added checks for sortedness and heap property to sort and heap functions
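For readers unfamiliar with array-based heaps, here is a minimal sketch of the index arithmetic, the 'sink' step, heap construction and a heap property check. It assumes a 0-based max-heap of ints with a hard-wired comparison; the Erwin functions use the configured comparison function and may differ in conventions. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static size_t sketch_heap_left(size_t i)  { return 2 * i + 1; }
static size_t sketch_heap_right(size_t i) { return 2 * i + 2; }

static size_t sketch_heap_father(size_t i)
{
    assert(i > 0);  /* the root has no father */
    return (i - 1) / 2;
}

/* 'sink' (heapify): move a[i] down until neither child is larger. */
static void sketch_heap_sink(int *a, size_t n, size_t i)
{
    for (;;) {
        size_t l = sketch_heap_left(i);
        size_t r = sketch_heap_right(i);
        size_t top = i;
        int tmp;
        if (l < n && a[l] > a[top]) top = l;
        if (r < n && a[r] > a[top]) top = r;
        if (top == i) return;
        tmp = a[i]; a[i] = a[top]; a[top] = tmp;
        i = top;
    }
}

/* 'make_heap' (build_heap): sink all inner nodes, last one first. */
static void sketch_make_heap(int *a, size_t n)
{
    size_t i = n / 2;
    while (i > 0) {
        sketch_heap_sink(a, n, --i);
    }
}

/* Heap property check: no element is larger than its father. */
static int sketch_is_heap(int const *a, size_t n)
{
    size_t i;
    for (i = 1; i < n; i++) {
        if (a[sketch_heap_father(i)] < a[i]) return 0;
    }
    return 1;
}
```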
runtime type information (real information, not only the usual C++ stuff) for all types.
bug fix in maps (thanks to Frank Fontaine): Global_ was missing in map_u.h in the ENSURE_VALUE #define orgy.
From the targzs, I reconstructed the following for 2.0.267 to 2.0.269:
assert is better suited for Kernel programming now
string_equ, string_case_equ, string_n_equ, string_n_case_equ added.
gcc 3.0 extensions to support __builtin_expect. Available through ERWIN_EXPECT and ERWIN_LIKELY:
if (ERWIN_EXPECT (x > 5 , 1)) { ... }
if (ERWIN_LIKELY (x > 5)) { ... }
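Macros of this kind are typically wrappers around __builtin_expect with a fallback for other compilers. The following sketch uses hypothetical SKETCH_ names and is not the actual Erwin definition.

```c
/* Use the builtin on gcc >= 3 (and compatible compilers); otherwise just
 * evaluate the condition without any hint. */
#if defined(__GNUC__) && __GNUC__ >= 3
#  define SKETCH_EXPECT(cond, value) __builtin_expect((cond), (value))
#else
#  define SKETCH_EXPECT(cond, value) (cond)
#endif

/* '!!' normalises the condition to 0 or 1 before comparing with the hint. */
#define SKETCH_LIKELY(cond) SKETCH_EXPECT(!!(cond), 1)

/* Example: the branch for x > 5 is marked as the common case. */
static int sketch_at_least_5(int x)
{
    if (SKETCH_LIKELY(x > 5)) {
        return x;
    }
    return 5;
}
```

The hint only influences code layout and branch prediction; the result of the condition is unchanged.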
basic Watcom compiler support
some functions were unnecessarily recursive in Map.cd and have been converted to iteration in order to trigger fewer problems under Windows and Mac OS X, which have non-dynamic stack sizes.
ALLOW_OUT_OF_RANGE behaves more consistently now for vectors. There were also bugs which returned a pointer to after the vector for nth_char, I think.
No idea -- cannot easily be reconstructed since the CVS was lost between 2.0.264 and 2.0.269.
- made quotation work for %c, too. I hope this does not break things.
Multibyte characters are quoted in their multibyte form (so that the multibyte conversion cannot break the quotation). I think that is what people expect.
made Vector::format() more standard compliant. It passes the 5000+ (mostly trivial, however) tests of glibc 2.3.2. The vectortest2 found a bug in my libc, btw.: in sprintf, %hhd / %hd are broken while %hhi / %hi are not. Strange. I could not find this in the source code of glibc 2.3.2, so I suppose it's fixed by now.
Features still missing in format(): $ - position, n - write the position. (The latter feature is somewhat available with the info-structures, my own extension).
GNU extensions I included (hoping it will not get into conflict with anything 'official'): ' modifier (only recognised, not locale dependent), %m strerror(errno) (works for Windows, too by using ERWIN_STRERRNO macro)
for reference counting, I added a new type distinction to all classes: iTypeTouched (and oTypeTouched). These are function parameters that are eventually 'copied' (with iType_ICOPY or oType_OCOPY) or 'freed'. I put these in quotes here, since the copying it is needed for is ref() and unref(). The interesting thing is that modify() and set() have different prototypes, since modify never ref()s its input, while set() does in case the element is new.
(Internally, the poke_internal function now has a prototype not fully suitable for implementing modify. Maybe this should be fixed.)
made the Erwin string wrappers have names known from DOS and Unix, e.g. erwin_strcasecmp and erwin_stricmp as an alias for string_case_cmp, which no-one seemed to be able to remember. :-)
During compat removal: again unified the C and C++ types: using a #define for the C++ type is not very good, since there were problems with #include orders. Instead, I changed it to a typedef, with the effect that the C++ structure must be implemented with the name Vector_t instead of Vector_class. I did not find a way to get all of the following features:
1. have three aliases for the same type: struct vector_char_t, vector_char_t, VectorChar
2. implement the C++ class using the name VectorChar
3. have no #define
I judged 3 to be more important than 2, so the structure of the forward declarations now is:
struct vector_char_t;
typedef struct vector_char_t vector_char_t;
typedef vector_char_t VectorChar;
Before this, I had:
struct vector_char_t;
#define VectorChar vector_char_t
typedef struct vector_char_t VectorChar;
And yet before that, vector_char_t and VectorChar were distinct types. And yet before that, there was struct _vector_char_t.
removed COMPATs. This was a hard decision, but the effort of maintaining them was high. Plus, I never tested them anyway, so to reduce my amount of programming, I removed them.
started to add Watcom compiler support (thanks to Michael Schmidt)
a new data type: doubly linked list
fixed a problem with autoheader > 2.53: remove the PACKAGE_ constants from the generated config.h.in.
The C type, e.g., crl_..._t and the C++ type, Crl..., are exactly the same type now. This improves interoperability between C and C++
fixed a version number mismatch in this file
string_string, _n_string, _case_string, _n_case_string and aliases erwin_strstr, strnstr, strcasestr, strncasestr, stristr, strnistr
added cn() to Map
fixed Vector_cmp to be NULL-safe
implemented Map_cmp_keys, Map_equal_keys
fixed a memory leak in Vector_setsize
added Vector_erase_equals
added string_n_cmp, string_n_case_cmp, string_n_dup
added support for using DOS functions stricmp and strnicmp in the implementation of the wrappers string_n_cmp and string_n_case_cmp.
added macros to make available standard libc-like names: erwin_strlen, erwin_strdup, erwin_isalpha, etc. These macros include both the Linux and DOS names, i.e., erwin_stricmp and erwin_strcasecmp are available.
added format options: FO_QUOTE_HTML, FO_QUOTE_URL, FO_QUOTE_IN_LISP_STRING.
for compatibility with libc functions, changed the return value of string_length to size_t.
made docu compile again
added assembly code for PowerPC
added assembly code for MSVC++
fixed format code to not access behind strings, not even by one character, so that it is possible to write
char c; .... format (FO_QUOTE_IN_C_STRING, "'%.1s", &c);
with an access at address &c, but no access at &c+1. This even works for quotation, etc.
The above does not work because the output string is counted instead of the input string, so '\n' is printed as '' instead. We will probably need a flag for that.
- fixed severe bug in operator=
oType_OCOPY and iType_ICOPY are tested for not setting the error flag when no error occurred.
bug in find_ptr_ensure removed (this function did not rehash)
_nthptr renamed to _nth_ptr. There is a compatibility define to keep _nthptr as well (Global_ERWIN_COMPAT_2_0_260).
added first_ptr and last_ptr functions for vectors.
re-implemented FO_QUOTE_LISP_STRING
(definitely in 2.0.260 -> 2.0.261):
all _HASHVAL things renamed to _HASH_RAW. Same with _hashval/_hash_raw. The _HASHVAL defines still work due to frequent usage.
new functions for maps:
Map_insert_map ;; like set union (modified wins)
Map_set_map ;; like set union (modifier wins)
Map_erase_map ;; like set subtraction
Map_modify_map
Map_intersect_map ;; set intersection (modified wins)
And corresponding changes to the C++ type.
C++ iterators finally work for pointers and references:
MapIntInt a;
int key, value;
map_forall (&a, key, value) { ...} works
map_forall (a, key, value) { ...} works, too
- Iterators support iterating over the data structures with a pointer to the values. This is implemented for maps and vectors.
Added LESS_{CPPFLAGS,CXXFLAGS,CFLAGS,LDFLAGS} according to the MORE_ variants.
New macros: map_forall_ptr, map_forall_values_ptr, map_forall_pairs_ptr ... (And a lot of them with sorted_by_blahblah, too.)
Adjusted the order of MORE_... variables. I hope this does not break anything. If so: the current order is most sensible.
The order of objects in the library is determined now in order to have objects initialised in the correct order. (untemplatize reversely sorts the names of the objects by their string length).
Cyclic dependency removed from map.o and base.o by introducing init.o. This will allow the use of lorder and tsort to manage linking order.
Removed the automatic invocation of Global_erwin_init from map.cd because it makes the object dependency graph cyclic. Instead, a fatal error is generated.
- Added Map_erase_if.
?? FIXME: find out what has changed - Bug fixed in Vector_to_lower which did the same as Vector_to_upper...
- Bug fixed in vector_delete (thanks to nico!): NULL is always a valid input to _delete due to calling conventions of _delete handlers for all data types.
Some more docu (M4 parts: autoconf (partially), Makefile generation)
Matthias found bugs in top-level Makefile.in and doc/makedoc.pl. Fixed.
Licence changed (source code must always be left available).
Two serious bugs fixed in format-family. (Why has no-one triggered them before??)
Documentation TODO list started
Documentation!
MSchmidt found a memory leak in poke_no_(i|o)copy family.
Again, noted inconsistency in naming of Vector_insert_subvector, Vector_overwrite and Vector_set functions: _insert_subvector is unique to the insert family and quite similar to _overwrite, but has one additional argument. This is very ugly. Furthermore, _set and _overwrite are basically the same so the whole overwrite family should rather be called _set (_set_raw, _set_vector, _set_string, ...). However, it sounds strange, so maybe _set should be renamed to _overwrite. But that would break all programs written so far. And is ugly, too. No solution yet, I will have to think about it.
Added 'explicit' to most unary constructors. This might cause problems if you relied on that feature. There is no compatibility define. Add the necessary casts to your program. (FIXME: add a compatibility define).
CANCELLED BECAUSE OF INVISIBLE INCOMPATIBILITIES
CANCELLED: Added an 'operator int()' to maps and vectors. Together with
CANCELLED: the added 'explicit', this changes the behaviour of e.g. the
CANCELLED: operator == on a map:
CANCELLED:
CANCELLED: Let m be a map.
CANCELLED: Before: m == MAP_OK  =>  m == Map(1)    This is a horrible misinterpretation!
CANCELLED: After:  m == MAP_OK  =>  m.get_errno() == MAP_OK
Started to write acerwin.m4 which will become a replacement for configure.def and configure.erwin. The two will vanish in version 2.1.x of Erwin, but for now, they will be frozen, installed, etc, but have a feature freeze.
Changed --with-cpp to --with-cxx to reduce confusion. This is an incompatible change, but the old option had a bad name.
Renamings:
    BOOL  -> Global_ERWIN_BOOL
    TRUE  -> Global_ERWIN_TRUE
    FALSE -> Global_ERWIN_FALSE
There is a compatibility #define to get back to old behaviour:
#define Global_ERWIN_COMPAT_2_0_249
The --xinclude is put at the end of the basic include hierarchy, namely at the end of erwin/base.h. So --init --xinclude=... is suitable for declaring types needed by the data structures (e.g. typedef char const *symbol_t) with all the configuration data already available.
-D..., -U... If the names you define or undefine start with _, iType_ or oType_, this will be replaced by the name of the data structure (with the replacements made by --name, see below), by the iType or the oType. Note that no substitutions are visible to that part of untemplatize. This might need to be fixed some time.
--name=.... renames the data structures. This is a bit more than replacing the names by standard substitutions: it also affects the substitution of the -D and -U names.
These options simplify e.g. the declaration of a symbol table:
> untemplatize map \
      iType='char const *' oType='int' iTypeVar='char *' \
      -DiType_ICOPY=string_dup \
      -DiType_IFREE=string_free \
      -DiType_CMP=string_cmp \
      -DiType_EQUAL=string_hash \
      --name=symtab
If you have a prefix,
> untemplatize --init --global-prefix=special_
this is especially handy:
> untemplatize vector oType=int -D_ALLOW_OUTOFRANGE=1 --name=my_v_int
This is like:
> untemplatize vector oType=int \
      -DMY_V_INT_ALLOW_OUTOFRANGE=1 \
      special_vector_int=my_v_int \
      SPECIAL_VECTOR_INT=MY_V_INT
The types are defined differently:
    typedef struct _X_t X_t   ->   typedef struct X_t X_t
    class X                   ->   struct X
(The latter in order to enable usage of class pointers in C).
There is a compatibility define to change this back to old behaviour:
#define Global_ERWIN_COMPAT_2_0_248
An additional include file exists: erwin/forwards.h. This declares all types defined by the Erwin library without the need of the basic header files. All Erwin data structures include this file, so you can use at least all the pointers of Erwin structures in other Erwin structures without having to use --include=.... with untemplatize.
Map::errno renamed to Map::get_errno because <errno.h>::errno may be a macro. Unfortunately, this will break compatibility.
Changed maps to use operator new/delete/delete[] when compiling C++. This is needed for boxed storage. It also creates heavy problems with vectors that rely on realloc. Obviously, no C++ designer thought about this thoroughly... The design decision to have to realloc in C++ is not very good.
Added possibility to define MORE_*FLAGS during call to untemplatize -init.
Vector: Added default value for oType_HASH if oType_HASHVAL is defined.
HASH and CMP are always defined for the C++ class now.
oTypeResult and oTypeParam implemented.
Changed defaults for 'constant zero'. All pre-defined pointer types now have the constant zero element NULL.
Vector_insert_subvector got another argument. This will break compatibility. However, the function was quite new and only Michael has used it yet, so there is no compatibility #define.
Added documentation
Vectors are quiet now for out of range cases. You can make them verbose again by defining ERWIN_VERBOSE.
Changed ERWIN_INLINING to ERWIN_INLINE_FUNCTIONS and used it for vector and map implementation as well. FIXME: This does not work yet.
If you want errors not to be printed to stderr, you can define ERWIN_ERROR_STREAM (which defaults to stderr) and ERWIN_ERROR_PRINT (which defaults to fprintf). The calling conventions are the same as for fprintf (thus suitable for Vector_format).
Added support for Debian package creation by an additional INSTALL_PREFIX
Added support for threads. This made some changes necessary. Most are only visible when compiling for thread safe operation. But two functions now have an additional argument:
Vector_set_quotation_method Vector_get_quotation_method.
There is no compatibility #define because the functions are likely not to be used a lot.
All remaining Makefiles became Makefile.in. Some Makefile.in became Makefile.m4 (currently only one: templates/Makefile.m4). GNUMake features were removed.
The configure script automatically generates Makefile.in if they do not exist. Added makemake.sh for developers to re-generate Makefiles.
Restructured Makefiles to be suitable for developers. Doc and templates are automatically generated when CVS directory exists.
Common m4 macros were moved to common.m4 (used by doc extractor and by Makefile.m4)
Improvements of map iterators:
C++: map_forall*_nondet added to force non-deterministic iteration (if user knows it does not matter, this may be used)
C: Map_forall* macros renamed to Map_forall*_nondet to force compile-time errors when compiled with REQUIRE_DETERMINISM. If no determinism is required, the normal macro names are provided.
Changed hashval and equal to use the _nondet versions because order does not matter (they are programmed in such a way that correctness of result does not depend on order).
Other changes:
_erase and _set return int because I found that MAP_ERR_NOMEM can occur...
Added annotations of error codes to every map function and checked that on error, the 0 value is returned, not something else.
Added missing out-of-memory handling code in Map_get_* family.
Reformation of Makefiles and version management. Development versions are now marked with -preN, where N is automatically incremented for every trial of compilation. The only way to get rid of -pre suffix is to do
make release
which also automatically commits the current version, tags it with an appropriate version number, and makes a .tar.gz file.
Added cvstargz.sh to the distribution.
Moved check_if.pl from contrib to bin.
Added Makefile targets: count and list-files.
Reformation of C++ vector iterators. In 2.0.241 so many nice map iterators were introduced that I wanted some of them for vectors as well.
Completed support for determinism by fixing Map_get_(entries|keys|values).
Renamings:
    Map_delete_flat  -> Map_delete_flags
    Map_clear_flat   -> Map_clear_flags
    Map_destroy_flat -> Map_destroy_flags
Done for consistency with vectors. There is a compatibility #define.
ERWIN_TMALLOC is used in addition to ERWIN_TCALLOC. The library warns if you pre-defined ERWIN_TCALLOC but not ERWIN_TMALLOC (only if you have gcc, because otherwise there is no #warning) so you can add a #define for ERWIN_TMALLOC as well. In this case, ERWIN_TCALLOC is also used for ERWIN_TMALLOC.
The compatibility #define switches back to the use of ERWIN_TCALLOC.
Renaming: ERWIN_RANDOM -> Global_ERWIN_RANDOM (No compatibility define. If you developed libraries with Erwin2, rename your function)
Added support for deterministic data structures: Global_ERWIN_REQUIRE_DETERMINISM
Added many forall-macros. Most of them are generated with a Perl script because their definitions are very similar and there are many of them. The forall-reformation was done because of determinism issues that now play an important role.
Features for vectors:
    Vector_NO_AUTO_SHRINK      : added
    Vector_ALLOW_OUTOFRANGE    : completed
    Vector_MINIMAL_SIZE == 0   : special case optimised and documented
Hopefully these work now.
Incompatibilities: Vector_subvector has an additional argument now. This has to be set to Global_ERWIN_TRUE to get old behaviour.
Renamed Vector_NULL_ALLOWED to Vector_ALLOW_NULL. This is more consistent with Vector_ALLOW_OUTOFRANGE.
A hell of a configure script we have now...
Map_mean_line_length was renamed to Map_deviation_line_length.
Changed (char const) parameters back to (char) in base.h. I hate this VC++6.0. The new prototypes are nicer now. Compiling under Windows, things might break now.
Serious bug found in the vector functions that erase elements. When specifying that the vector should not be re-allocated (resize == Global_ERWIN_FALSE), the vector would not even adjust its size. This is fixed.
If you relied on this bug, your code will unfortunately break. No compatibility #define was introduced.
Serious bug in vector::fread which was completely broken and caused SIGSEGV.
cut renamed to erase
cut_if added
vector::nthptr returns a pointer. Not a reference. This will create incompatibilities but the name is misleading.
BTW: There was a bug in vector_u.h (.hpp) which will lead to compiler errors because vector_u.h(pp) is not included. To change: either delete vector_u.hpp and re-untemplatize or edit and change #ifndef VECTOR_H to #ifndef VECTOR_U_H
Hash values can be calculated from maps and vectors. To use them, you have to supply hash functions for oTypes of vectors and maps.
untemplatize should be invoked like this for a symbol table:
untemplatize map \
    iType='char const *' iTypeVar='char *' \
    oType='int' \
    -DCHAR_CONST_P_ICOPY=string_dup \
    -DCHAR_CONST_P_IFREE=string_free \
    -DCHAR_CONST_P_HASH=hash_string \
    -DCHAR_CONST_P_CMP=string_cmp
There will be an abbreviation for this. It will probably be called `untemplatize symbol_table'.
Suppose you declare the following in C++: MapCharConstPInt string2symbol; Then you can write string2symbol("hello") to call string2symbol.ensure("hello").
The advantage of this symbol table is that you can store additional data for each symbol in the same hash table (the oType does not matter for this to work) by mixing the usage of _ensure(), _insert() and _set().
However, `untemplatize symbol' will be supported for back-ward compatibility reasons. It does not hurt at all to have it.
Speed up of untemplatize by a jump table.
Restructuring of Makefiles, Documentation subdirectory merged from two development versions.
-DERWINMM_COMPAT1 is untested for at least a hundred subversions now. It is not at all guaranteed to work anymore.
A settings file is written into the Erwin subdirectory. From here, some important settings that affect the -init command and the normal untemplatisation commands are read. These settings are:
-I= --global-prefix= --subdir= --cpp-support=
The default has not changed. It is still:
-I= --global-prefix= --subdir=src,include,.. --cpp-support=1 # depending on configuration of package.
These should now be defined during the -init phase and not during untemplatisations. untemplatize will read the settings from that settings file.
Furthermore, -init will not be done automatically by default. You really should do it as a separate initial step now.
The setting of --erwin-lib is NOT stored in the settings file because the global installation directory might change although it is still the same version. Incompatibilities will be warned about via the version file of the installation and the local copy.
--cpp now implies --cpp-only. To force old behaviour, use
-cpp=... -cpp-only=0
(Replace ... by your favorite C++ file extensions. The default is -cpp=.cpp,.hpp)
Very much has changed although Erwin should be compatible with old versions in most aspects.
Prefixes are supported for almost all identifiers of Erwin. This includes macro names, functions, classes, types and the name of the library itself.
The name of the include/erwin directory can now be changed still keeping the library compilable.
A few things were renamed. All of these are not prefixable since they only depend on the compiler, not the user:
    old                       new
    ERWININLINE               ERWIN_INLINE
    ERWINLONGLONG             ERWIN_LONG_LONG
    ERWINUNSIGNEDLONGLONG     ERWIN_UNSIGNED_LONG_LONG
    BLOCK_BEGIN               ERWIN_BLOCK_BEGIN
    BLOCK_END                 ERWIN_BLOCK_END
    E_RANDOM                  ERWIN_RANDOM
    PROFILE                   ERWIN_PROFILE
And the prefixable ones:
    PRINT_iType               iType_PRINT
    PRINT_oType               oType_PRINT
Additional macros were changed in behaviour due to fixes of cross compilation:
    interim name              name now
    ERWINDOS                  ERWIN_DOS
    ERWINMSVC                 ERWIN_MSVC
    ERWINCROSS                ERWIN_CROSS
    <none>                    ERWIN_NEW_ASM_SYNTAX (gcc only)
Additional macros: ERWIN_GENSYM(X) generates the symbol (X_LINE_). This is useful for local variables of for() loops in macros on broken compilers.
Naming conventions:
The new changes had to be done to almost every file of the Erwin library. The identifier names in template files and Erwin source files are even more different than before so the following sections should clarify the usage now.
There are three kinds of exported symbols:
Symbols shared by all versions of Erwin. These are derived from the compiler settings only and are considered not to be subject to changes except for bug fixes. These symbols will not change their name when the --global-prefix option is used with untemplatize. Only macro names are gathered in this category.
Some examples are: ERWIN_INLINE ERWIN_LONG_LONG etc.
The definitions of these symbols can be found in defs.h, which does not contain any prefixable identifiers.
Symbols which are shared by all data structures of Erwin. These symbols usually start with ERWIN_, MAP_, map_, VECTOR_, vector_ or the like (uniform capitalisation of letters).
These symbols are prefixable and must therefore be prefixed with Global_ in the sources. Note that `Global_' is replaced by the user prefix without regard of the context, thus even in something like `MYGlobal_var'.
With an unset --global-prefix:
    library source            local copy
    Global_MAP_OK             MAP_OK
    Global_vector_errno       vector_errno
    Global_map_iterator_t     map_iterator_t
With --global-prefix=abc_
    library source            local copy
    Global_MAP_OK             ABC_MAP_OK
    Global_vector_errno       abc_vector_errno
    Global_map_iterator_t     abc_map_iterator_t
Note that the library name also changes with the global prefix. All underscores will be deleted there. In the above example this would result in a library file called libabcerwin.a.
Symbols belonging to an instantiated data structure. In the template files, these begin with Vector_, Map_ or the like (mixed lower and upper case). Because the prefixes like `Vector_' are replaced anyway, the --global-prefix is also automatically replaced.
The class names belong to this group as well.
For a map from int to char*:
With an unset --global-prefix:
    library source            local copy
    Map_new                   map_int_charp_new
    Map_class                 MapIntCharpNew
With --global-prefix=abc_
    library source            local copy
    Map_new                   abc_map_int_charp_new
    Map_class                 AbcMapIntCharpNew
Changes to fix cross compilation with djgpp (tested: Linux -> DOS32)
Added nmakefile support. This was quite complex. At the same time, no GNUMake should be required anymore.
Started to make file names 8.3 conform. (asm_gen.h, confdos.h).
Managed to make library compile with MS VC++ under NT.
MAP_OK and VECTOR_OK were changed to 1 and all others to different values. This will create problems if you tested for 0, but I want compile time errors when the result of e.g. Vector::insert is compared to VECTOR_OK.
Additional macros are introduced: X_IS_WARNING(X), X_IS_ERROR(X), X_IS_OK(X) for X \in \{ MAP, VECTOR \}
FEATURES: Vectors now accept NULL as an empty string and as a table if the table size is 0 when inserting and constructing.
BUGS: Something like the following linker error occurred in objects which had static vector or map declarations. I don't know why. I think it's a compiler bug. However, this means the linker tricks usually don't work. Until I come up with a good idea of how to make this work and protect the user from problems concerning #define, I will sadly disable the link trick check for compile time settings.
shelf.o: In function `global constructors keyed to vector_char_constant_zero_element_expected':
/home/ht/project/c/X/xcdshelf/shelf.cpp:42: multiple definition of `global constructors keyed to vector_char_constant_zero_element_expected'
M_mergedia.o:/home/ht/project/c/X/xcdshelf/M_mergedia.cpp:71: first defined here
ld: Warning: size of symbol `GLOBAL.I.vector_char_constant_zero_element_expected' changed from 23 to 18 in shelf.o
make: *** [xcdshelf] Error 1
PS: I tried to use extern inline instead of __attribute__((weak)) but then gcc does not generate any definition for the functions in question, again causing linker errors.
Renamed oType &Vector_class::nth() to nthptr because that stupid, silly, brain(!)dead compiler unnecessarily chose this one over the nth() const one, ending up in a SIGSEGV when automatically dereferencing the result. Who programmed that stupid choice into the compiler? (Ok, perhaps that's the standard choice, but I don't want that).
This can cause incompatibilities with old versions.
Because the linker tricks to prevent inconsistent compilation of the library and your application sometimes might cause problems, I included a #define to prevent those games. Simply globally
#define ERWIN_NO_LINKER_TRICKS
if you know for sure that you compile the library and your application with the same #define settings and you still get very strange linker errors and warnings.
With NDEBUG, the linker tricks aren't played anyway. The same holds for compilers other than gcc.
parray works again. There were serious bugs in pmap.ht which I overlooked, which was the reason for the compilation failures.
How to use maps instead of arrays.
After porting some programs by a quick hack from pre-2.0.173 to 2.0.176, this is a report of what to do.
If you have lines in your Makefile which generate links from erwin/include/vector_u.h to ./vector_u.h to keep them in the application directory and make it possible to simply rm -Rf the whole erwin sub-directory, then remove these links. The default place for _u.h(pp) files is now in the application directory.
If you don't mind warnings, you can still use `array' instead of the new type `map'. untemplatize will enter a compatibility mode.
If you use `parray' it is not easy. For a quick hack, I used `array' to have the same naming but multiple implementations. There is no easy way to implement a compatibility mode, otherwise I would have done so. (It does work in 2.0.277 and above!)
Delete the Erwin-directory completely (after ensuring you did not change any _u.h files there). If you changed _u files there, put them in the directory just above the erwin sub-directory.
untemplatize all data structures.
Re-configure, re-make.
This worked for quite large projects for me and was not too much work.
Changed #ifndefs and #ifdefs to #ifs. This makes it possible to globally define a property to be true and locally undefine it (#define to 0). This was not possible before. If you set properties like this:
#define VECTOR_CHAR_NO_RANGE_CHECK
you will now get compile time errors. Change these to
#define VECTOR_CHAR_NO_RANGE_CHECK 1 .
This does not affect the global values for which an #define is ok:
#define VECTOR_CONSTANT_ZERO
is therefore ok. However, for consistency reasons, you should change this to
#define VECTOR_CONSTANT_ZERO 1
Arrays were renamed to Maps. You should change your applications as well. Alternatively, you can use the following untemplatisation command to generate the old naming.
Instead of untemplatize array ..
write
untemplatize map map=array MAP=ARRAY ...
And you will hopefully get the old names.
If you simply type `make install' now on a system which already has an old Erwin2 library installed, the (old) array implementation will be kept and not be overwritten or deleted.
However, you should think about removing it soon as it does not fit the current version of Erwin.
The default sub-directory for user definition files is now `..'. This means instead of writing to erwin/include, untemplatize will put them into the current directory. This makes many ugly hacks with links unnecessary. If you want the old behaviour, invoke untemplatize with the following option:
--subdir=src,include,include
This puts them back in the right place again.
There is an additional include file now, belonging to the implementation alone. It is called Array_d.h (for `definitions'). It was included by splitting Array_u.h into two files. One only containing comments (new _u.h file), the other (_d.h) the default definitions.
return values adjusted for some functions to make them consistent with convention in C++. You may get compile errors because of this. See the last change of vectors for explanation.
NPROFILE changed to !PROFILE, i.e., all #ifndef NPROFILE were changed to #ifdef PROFILE. If you use profiling, this means you need to define PROFILE somewhere. If you don't, it means your applications will be faster and use less memory. Profiling is a developer's feature, so the default should be off.
PS: When reading the implementation of arrays, I was astonished at how small it is. Hashing is by far more complicated than resized arrays. However, vector has grown a lot in the past and has provided many, many convenience functions which are unsuitable for arrays.
PPS: Array::erase is still missing. This is a shame. A very embarrassing fact to notice.
locate() is new.
C++ member functions now return (Vector_class &) instead of int if the old result value was the error status. This may cause compile errors.
Before, you might have written:
if (v.set (index, 5) != 0) { ... }
Now, you must write:
v.set (index, 5);
if (vector_errno != 0) { ... }
On the other hand, you can now write:
int new_length= v.set (index, 5).append (6).string_length();
make_gap_with() had an inconsistent argument order, which was fixed.
bfind changed the order of parameters. This will cause compile errors. Simply swap the second and third parameter. This was done to allow for a NULL default value in C++ for the comparison function.
Because untemplatize fixes the Makefile.in and Makefiles depending on the C++ feature, you should now always use untemplatize -init -cpp... when you intend to mix C and C++ usage. This is because if you have untemplatize initialise the directory automatically when not using the -cpp option, your later untemplatized C++ files will not be compiled because there is no rule in the Makefile.
This was an important change that was necessary to make the Makefile work when .cpp was NOT the C++ file ending.
a) Vector_format was redesigned to use ANSI conformant format strings. This was done in order to enable the format string warning feature of gcc. Unfortunately, you might need to fix your software for this to work. Earlier, you wrote (C++ style):
v.format ("%:s", a)
Now you do: v.format (FO_QUOTE_C_STRING, "%s", a)
Because the options are valid for the whole format string, you might need to split formats like "%:s %s" or something.
Note that some features were temporarily removed because the interface would be very complicated. E.g. the loops and if-statements in format strings are not valid anymore. See templates/vector.hd for details.
b) The configure call has been enhanced to support subdir calls. For this, an additional file `configure.erwin' will be generated which you can paste into your toplevel configure file to recurse into the erwin directory. There are mechanisms to pass additional options (see configure.erwin for details).
c) There is support for library use of Erwin now. The make target `objects' generates the .o files and then a file libobjs containing the names of the .o files you must add to your ar call. See README for details. Another detail is that erwin_init() may now safely be called several times. This way you can call erwin_init from your library's initialisation function.
New data types `pvector' and `parray'.
These provide exactly the same functionality as `vector' and `array'. But instead of implementing the functions, wrappers around the functions of another vector or array are generated. There will only be a header file, not an implementation file. All functions will be declared inline and will only be one-liners.
The reason for this: Suppose you want to have many arrays from `Symbol' to some pointer type but do not want to have implementations of arrays for every pointer type since the compiler will generate the same code anyway because all pointer types are handled in exactly the same way. You could therefore come up with the idea of having one array from Symbol to char* (don't use void* because the implementation uses pointers to oType and void** is not portable!) and use this for all your other pointer types using casts. This method prevents you from enjoying the type mismatch error messages the compiler can generate. Therefore, the necessary casts should be encapsulated by generating inline functions. This is what pvector and parray do.
E.g.: You have one array for char*:
untemplatize array iType='Symbol' oType='char*'
For the pointer types `int*', `FILE*' and `SomeData*' you can generate wrapper implementations around this. By ipType and opType you specify which will be the underlying implementation.
untemplatize parray iType='Symbol' oType='int*' ipType='Symbol' opType='char*'
untemplatize parray iType='Symbol' oType='FILE*' ipType='Symbol' opType='char*'
untemplatize parray iType='Symbol' oType='SomeData*' ipType='Symbol' opType='char*'
The interface is exactly the same, you should not be able to feel the difference. Of course, the derived implementations inherit all settings like hash and comparison functions.
If you have a good compiler, not a single byte of additional code will be generated for these implementations in most cases.
(Note to developers: The files parray.hd and pvector.hd are automatically generated from array.hd, vector.hd, parray.ht, and pvector.ht by the Perl script makewrapper.pl. For this to work, you must not do ugly things in array.hd and vector.hd).
For consistency reasons, the `size' and `clone' members were renamed to be called `nentries' and `copy' like in C again. Erwin 2 was not publicly released in between so this should not be noticed by anyone other than the developers.
The library was changed to support all functions in one C file. No conflicts will occur. To do this, #define IMPLEMENTATION before #including the header files.
You can additionally define STATIC to change the storage from `extern' to `static' for all functions.
And you can additionally define INLINING to add the storage modifier `inline' to all functions usually not inlined.
This currently only works for vectors. Arrays will support this soon.
The library is not compiled anymore. Instead, the source files will be copied to where the template files are instantiated to. A configure script is included.
This makes the library incompatible again with prior versions. You need to change your Makefile to use this library. All the rest should not be touched.
The default directory layout is as follows:
./erwin - Everything this library installs will go here by default.
./erwin/include - include files
./erwin/lib/liberwin.a - the library which needs to be linked
./erwin/src - the implementation of the data structures
To use the new library, call ./erwin/configure from your configure script. In your make file, add:
erwin: dummy
	(cd erwin ; $(MAKE))
Ensure that your `distclean' and `clean' targets do (cd erwin ; $(MAKE) distclean) and (cd erwin ; $(MAKE) clean)
Compile with `-I erwin/include'. Link with `-L erwin/lib -lm -lerwin'.
You can change the whole directory layout, but this is not very well tested. See `untemplatize -h' for instructions how to change dirs.
All changes were done in such a way that it is still possible to compile version 1 applications (although with compiler warnings). To do this, read the section on porting below.
- string_cmp has a corrected behaviour for NULL strings now. It was documented that NULL < "", but the result was exactly the opposite in version 1. Now the function works according to its specified semantics.
This might cause problems for upgraders.
Array_forall now works even if you nest two calls. This is done by using an additional iteration structure the user has to provide. Therefore, in C mode, the macro has one additional argument: Array_forall (ARRAY, ITERATOR, KEY, VALUE)
This will cause problems for upgraders.
Vector_forall has a changed order of arguments to make it consistent to Array_forall: Vector_forall (VECTOR, INDEX, VALUE)
A lot of additional const modifiers have been added. This might cause problems for upgraders.
oTypes equal to int need an additional define if used with dynamic ZERO elements (that is the default). This problem occurs because there are no named constructors in C++ (C++ support was added, see below). If you only compile your programs with a C compiler, this is no problem, but C++ users will have problems. See the porting section.
However, the library tries to circumvent this but it might fail.
assert.h is not included anymore. This might cause problems for upgraders.
C++ support was added to all template classes and to untemplatize. The C++ types are missing the _t at the end and use upcase notation. E.g. if vector_char_t is a C name, VectorChar will be the C++ class name. All the functions have the same names without the leading vector_int_ and will be inside the class's name space.
The following lines of C++ code virtually do the same:
C:
    vector_char_t *v= vector_char_new ();
    vector_char_append (v, '\n');
    vector_char_delete (v);
C++:
    VectorChar *v= new VectorChar;
    v->append ('\n');
    delete v;
Note that because of the automatic constructor call, you can use C++ to produce better code which uses less heap memory:
    VectorChar v;
    v.append ('\n');
    /* delete is implicit */
Some renamings took place, partly to make use of overloading:
    append_raw    -> append
    append_string -> append
    append_vector -> append
    insert_*      -> insert
    overwrite_*   -> overwrite
    clear_flat    -> clear
    new_with*     -> operator new
There are copy constructors for generating C++ objects from C objects and operators for converting C++ objects to C objects.
See the option description of untemplatize, especially -cpp.
The number of bytes consumed by arrays has been reduced by the introduction of an iterator structure.
It is now possible to have a constant ZERO value (for each instance separately). This saves some bytes for each allocated vector and array. To use it, use the following #defines (e.g. in array_u.h and vector_u.h):
For only using a constant zero value for special arrays:
#define ARRAY_INT_INT_CONSTANT_ZERO
Or for all arrays:
#define ARRAY_CONSTANT_ZERO
The same for vectors:
#define VECTOR_CHAR_CONSTANT_ZERO
Or for all vectors:
#define VECTOR_CONSTANT_ZERO
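Where the saved bytes come from can be sketched roughly like this. All sketch_* and SKETCH_* names are hypothetical and the struct layout is not Erwin's. The sketch also assumes that the zero element is what a lookup yields for a missing entry.

```c
#include <stddef.h>

/* All sketch_* / SKETCH_* names are hypothetical, not Erwin's layout. */

/* Variant with a dynamic zero element: every instance stores its own. */
typedef struct {
    int   *data;
    size_t count;
    int    zero;
} sketch_vector_dynamic_t;

/* Variant with a constant zero element: the field disappears. */
#define SKETCH_CONSTANT_ZERO 0
typedef struct {
    int   *data;
    size_t count;
} sketch_vector_constant_t;

/* Assumption: the zero element is returned for a missing entry. */
static int sketch_dynamic_nth (sketch_vector_dynamic_t const *v, size_t i)
{
    return i < v->count ? v->data[i] : v->zero;
}

static int sketch_constant_nth (sketch_vector_constant_t const *v, size_t i)
{
    return i < v->count ? v->data[i] : SKETCH_CONSTANT_ZERO;
}

/* Bytes saved per instance by the constant variant on this platform. */
static size_t sketch_bytes_saved (void)
{
    return sizeof (sketch_vector_dynamic_t) - sizeof (sketch_vector_constant_t);
}

/* Returns 1 if both variants yield their zero element for index 3. */
static int sketch_zero_demo (void)
{
    sketch_vector_dynamic_t  d = { NULL, 0, -1 };  /* per-instance zero: -1 */
    sketch_vector_constant_t c = { NULL, 0 };      /* zero fixed at compile time */
    return sketch_dynamic_nth (&d, 3) == -1 && sketch_constant_nth (&c, 3) == 0;
}
```

The trade-off is that the constant variant cannot choose a different zero element per instance at run time.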
Default values can be changed on a global basis. E.g. if you want to change the initial size of all hash tables, use the following define (in array_u.h):
#define ARRAY_INITIAL_SIZE 16
For a special value for one particular hash table type, you can still define:
#define ARRAY_INT_CHARP_SIZE 1024
Note that you can still change the default initial size when creating a new array (e.g. with array_int_charp_with_initial_size).
The same works for minimal sizes.
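The layering of global and per-instance defaults can be pictured with the usual #ifndef fallback pattern. This is only a sketch of the idea: the SKETCH_* macro names and the built-in value 8 are made up and are not the library's actual symbols.

```c
/* All SKETCH_* names are hypothetical; only the layering mirrors the text. */

/* User header (the role array_u.h plays): optional overrides. */
#define SKETCH_ARRAY_INITIAL_SIZE            16    /* all instances     */
#define SKETCH_ARRAY_INT_CHARP_INITIAL_SIZE  1024  /* one instance only */

/* Library side: fall back step by step. */
#ifndef SKETCH_ARRAY_INITIAL_SIZE
#  define SKETCH_ARRAY_INITIAL_SIZE 8              /* built-in default (made up) */
#endif
#ifndef SKETCH_ARRAY_INT_CHARP_INITIAL_SIZE
#  define SKETCH_ARRAY_INT_CHARP_INITIAL_SIZE SKETCH_ARRAY_INITIAL_SIZE
#endif
#ifndef SKETCH_ARRAY_INT_INT_INITIAL_SIZE
#  define SKETCH_ARRAY_INT_INT_INITIAL_SIZE SKETCH_ARRAY_INITIAL_SIZE
#endif

/* The instance with its own define keeps it ... */
static int sketch_initial_size_int_charp (void)
{
    return SKETCH_ARRAY_INT_CHARP_INITIAL_SIZE;
}

/* ... while an instance without one inherits the global override. */
static int sketch_initial_size_int_int (void)
{
    return SKETCH_ARRAY_INT_INT_INITIAL_SIZE;
}
```

With the two overrides above, the int-to-charp instance starts at 1024 entries while every other instance starts at 16.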
THE COMPATIBILITY MODE IS NOT YET IMPLEMENTED.
Step 1) Untemplatize all the needed files to make them match version 2 of the library. You can take array_u.h and vector_u.h from version 1. However, you should no longer use the version 1 {array,vector}_*_u.h files; instead, transfer your changes in those files to freshly untemplatized copies from version 2.
Step 2)
If you want to compile your version 1 files without thinking about changes, give the compiler flag
-DERWINMM_COMPAT1
when you compile your application.
YOU MUST COMPILE THE WHOLE APPLICATION WITH THIS #DEFINE! INCLUDING THE UNTEMPLATIZED DATA STRUCTURE IMPLEMENTATIONS.
You cannot make use of some features of version 2.
If you switch on pointer type conversion warnings, you will get warnings for all source code incompatibilities between version 1 and 2.
If you want to port your code to version 2, the following things have to be changed:
In version 1 you wrote:
array_charp_int_forall (ar, key, val) { printf ("a[%s]=%d\n", key, val); }
Now in version 2 you have to declare an iterator and then change the forall call:
{ array_iterator_t iter; array_charp_int_forall (ar, iter, key, val) { printf ("a[`%s']=%d\n", key, val); } }
In version 1 you wrote:
vector_charp_forall (index, val, vec) { printf ("v[%d]=`%s'\n", index, val); }
Now in version 2 you have to write:
vector_charp_forall (vec, index, val) { printf ("v[%d]=`%s'\n", index, val); }
Analogous changes have to be made to all other forall macros.
Step 3) If you have problems with const modifiers, your compiler should hopefully still only issue warnings. If you get errors, add -Dconst= to the compiler invocation if you need to compile right now, or, better, remove the deficiencies in your code.
Step 4) If you have problems with oTypes equal to int, you need to define something like the following in your user headers:
#define VECTOR_INT_NO_INT_CONSTRUCTOR
This will prevent the generation of an additional ambiguous constructor.
Step 5) If you need it, put an #include <assert.h> into your files. This include was removed from the Erwin-- header files.
Step 6) All other changes will not be switched back to version 1 behaviour. Currently, only the behaviour of string_cmp is known to have changed noticeably.
Step 7) To enforce version 2 in a configure script, you can do something like this (note the symbol used to determine the appropriateness of the library):
AC_CHECK_LIB(erwin, erwin_version_2, LIBS="$LIBS -lerwin", echo "This library is needed."; exit 1)