Future-safing C code?


Quasar

Graf Zahl said:

C++ will be 'safe' unless they remove reinterpret_cast from the language. That one's existence clearly tells that the designers knew well enough that serious programmers need such options.

If the types inherit you should be able to use dynamic_cast, though I understand some people try to avoid dynamic_cast because it requires C++ RTTI to be enabled, which probably adds at least a pointer's worth of overhead to the invisible size of each class instance, since it has to point back to the RTTI record somewhere.

However dynamic_cast is safer as you can use it to attempt casts, and if the attempt fails, the pointer returned will be NULL, making it easy to check-and-cast, ex:

if((foo = dynamic_cast<Foo *>(bar)))
{ 
  foo->baz = 10; // this can't run unless it's really an instance of Foo
}


I know. I just meant that as long as reinterpret_cast exists there's no danger that C++ can go the same route as C99 and lose all options of pointer hackery.


Possible solution that I just thought up, haven't tested yet and probably doesn't work:

Make the thinkers in mobj_t and other structs into pointers and add a (void) pointer to the thinker_t struct that points to the object it represents. This way the position of the thinker field is unimportant and casting the pointer in the thinker is just a normal cast, no reliance on aliasing and the compiler.

It would require a few extra bytes of memory and an extra allocation for everything using a thinker, but I think it may work. I'm convinced that a C solution can be found.

Example:

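// thinker_t is defined first so that mobj_t can refer to it
typedef struct thinker_s
{
    // existing fields (prev, next, function) stay as they are

    //pointer to mobj_t etc...
    void* object; 
  
} thinker_t;

typedef struct mobj_s
{
    // allocate when mobj is created
    thinker_t*		thinker;

} mobj_t;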

void P_AddThinker (thinker_t** thinker,void* object)
{
    *thinker = Z_Alloc( whatever );  //new
    thinkercap.prev->next = *thinker;
    (*thinker)->next = &thinkercap;
    (*thinker)->prev = thinkercap.prev;
    thinkercap.prev = *thinker;
    (*thinker)->object = object;  //new
}

void P_RunThinkers (void)
{
    //rest of code is the same
    if ( currentthinker->function.acv == (actionf_v)(-1) )
    {
        // time to remove it
        currentthinker->next->prev = currentthinker->prev;
        currentthinker->prev->next = currentthinker->next;
        Z_Free(currentthinker->object); //new
        Z_Free(currentthinker);
    }
    //rest of code is the same
}

void P_RemoveMobj (mobj_t* mobj)
{
    //rest of code is the same
    // free block
    thinkers_remove(mobj->thinker);  //changed
}


And yet another workaround to solve a problem that shouldn't be there.

I think you are completely missing the point. It's not looking for more or less clever ways to work around this pointless limitation. Quasar's concerns are that in the future there might be other crippling stupidities like this one introduced into C compilers.

Besides, GCC does have a command line switch to disable this feature so currently there's really no need to change the code. What's this obsession with making it compile with GCC without any command line options? Normally these should be set automatically in the makefile and be transparent to the end user who wants to compile the source.

Scet said:

Possible solution that I just thought up, haven't tested yet and probably doesn't work:

Make the thinkers in mobj_t and other structs into pointers and add a (void) pointer to the thinker_t struct that points to the object it represents. This way the position of the thinker field is unimportant and casting the pointer in the thinker is just a normal cast, no reliance on aliasing and the compiler.

It would require a few extra bytes of memory and an extra allocation for everything using a thinker, but I think it may work. I'm convinced that a C solution can be found.

Example:
*snip*

If you access the same memory through two pointers of *any* type not allowed to alias, including by casting a void *, the compiler's optimizer may end up generating code that does not make sense, and that's really what this strict aliasing bullshit is about. Here's an example from EE which violates strict aliasing in order to use an inheritance mechanism:

static void metaStringCopy(metatype_t *t, void *dest, const void *src)
{
   metastring_t *newString = (metastring_t *)dest;

   // invoke parent implementation, which will copy the entire object
   t->super->methods.copy(t, dest, src);

   // make string value a copy of the existing one
   newString->value = strdup(newString->value);
}
When dest is sent in, it is a metastring_t, always. The first line generates a warning that dereferencing the pointer may break strict aliasing rules. Do you see why? The contents of dest are altered by the call to the superclass copy method, which in fact is just like a memcpy.

But then below I use newString->value. The compiler, trying to be clever, may decide that newString->value never changes in this function, simply because it sees no access to it between the assignment and the memcpy-like copy, so it decides it is faster to read newString->value first and keep it in a register. Then down below it calls strdup(NULL) and crashes the program.

Brilliant.
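
One pattern that stays within the aliasing rules, as far as I can tell, is to embed the parent structure as the first member of the child, since C guarantees that a pointer to a structure also points to its first member. A minimal sketch follows; base_t, strobj_t and both copy functions are hypothetical stand-ins, not EE's actual metatable code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "parent" type; every derived struct embeds it first. */
typedef struct base_s
{
   int type;
} base_t;

/* Hypothetical "child" type with one extra field. */
typedef struct strobj_s
{
   base_t base;   /* first member, so &obj->base is the start of obj */
   char  *value;
} strobj_t;

/* Parent copy: a plain byte copy of the whole object. */
static void base_copy(void *dest, const void *src, size_t size)
{
   memcpy(dest, src, size);
}

/* Child copy: run the parent copy, then duplicate the string.
   Returns 1 on success, 0 if the allocation failed. */
static int strobj_copy(strobj_t *dest, const strobj_t *src)
{
   size_t len;

   assert(dest != NULL && src != NULL && src->value != NULL);

   base_copy(dest, src, sizeof(*dest));

   /* dest->value still points at src's buffer here; replace it */
   len = strlen(src->value) + 1;
   dest->value = malloc(len);
   if(dest->value == NULL)
      return 0;
   memcpy(dest->value, src->value, len);
   return 1;
}
```

Since the parameters are already typed as strobj_t *, nothing is cast from void * inside strobj_copy, and memcpy counts as a character-type access, which is allowed to alias any object. The caller owns dest->value afterwards and has to free it.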


Would it recognize if you got an integer pointer to a member variable and at the same time access the entire struct through another pointer?

If the compiler doesn't even catch such a simple case I'd rather avoid such an optimization feature altogether. It probably causes more problems than performance improvements, even in well written code...

Graf Zahl said:

Would it recognize if you got an integer pointer to a member variable and at the same time access the entire struct through another pointer?

If the compiler doesn't even catch such a simple case I'd rather avoid such an optimization feature altogether. It probably causes more problems than performance improvements, even in well written code...

That's a good question. I wonder how you can have aliasing pointers at all when the compiler is making assumptions that pointers don't alias just because they're of different types. Raises all kinds of questions about how they can ever be sure they're generating valid code. Here's why I think the entire thing is a folly that breaks down the most fundamental functionality of the C language. You can't take a language that allows arbitrary memory addresses to be manipulated and then try to impose a one-pointer-per-address view of the code onto it later with an optimizer. Something is going to break.

I wouldn't be surprised at all if GCC puked in the circumstance you're describing. If it doesn't that's some serious magic they're applying to figure out that the pointers could affect, indirectly, the same memory locations.

Like for example when you take the address of a structure member it marks the pointer as having a type annotation related to the structure type, and then treats pointers of that type as if they might alias.

But then you wonder, what happens if you do something overly clever like this:

// Some function in module A:
void foo(struct_t *bar, int *baz)
{
  memset(bar, 0, sizeof(*bar));
  (*baz)++; // parentheses needed: *baz++ would advance the pointer instead
}

// Some function in module B that won't inline A:
void blah(void)
{
  struct_t mystruct;
  foo(&mystruct, &mystruct.x);
  printf("%d\n", mystruct.x); // Prints 1, or uninitialized garbage?
}
If the latter is not safe, then C is now fundamentally broken, as you may need to have independent references to the insides of data structures which incidentally get manipulated at the same time, invisibly, in ways far more complicated and fundamentally less contrived than this example.
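
For what it's worth, my reading of the C99 rules is that this particular case is safe once the increment is written as (*baz)++: memset writes through character-type access, which may alias anything, and an int * pointing at an int member is a legitimate way to reach that member. A sketch under that assumption, with struct_t given a hypothetical layout since the post does not define it:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout; the original post leaves struct_t undefined. */
typedef struct
{
   int x;
   int y;
} struct_t;

/* Zero the whole struct, then bump the int that baz points to. */
void foo(struct_t *bar, int *baz)
{
   assert(bar != NULL && baz != NULL);
   memset(bar, 0, sizeof(*bar));
   (*baz)++;
}

/* Passes a struct and a pointer into that same struct, as in the post. */
int blah(void)
{
   struct_t mystruct;
   foo(&mystruct, &mystruct.x);
   return mystruct.x;
}
```

The trouble starts only when the second pointer has a type unrelated to the member it points at, for instance a float * aimed at x.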


It sounds like this is really an issue with GCC's optimizer rather than the C language itself. Of course, it doesn't help that GCC is the most common compiler on all non-MS platforms.

Scet said:

It sounds like this is really an issue with GCC's optimizer rather than the C language itself. Of course, it doesn't help that GCC is the most common compiler on all non-MS platforms.

GCC is the only compiler to take this route with its optimizer so far, but this language was placed into the C99 standard expressly for the purpose of allowing such shenanigans. Optimizer writers have salivated over this idea for years because it does (sometimes) offer dramatic improvements. But geez what a cost.


I always wondered why you stuck to C for so long. :P

Nothing against the language, but I have been thinking for 15+ years that it's not that great for larger projects (and concerning ZDoom, I know that much of what it does would be very cumbersome to do if it were still C).


Then again there's D, the language that allegedly will put C, C++, C# and Java out of business (even though I don't see how).


D is not bad. At least it looks a lot cleaner than C++ which, to be honest, is quite messed up in parts, particularly the multiple inheritance model.

I'd say it has the same chances as Objective-C, namely, it could succeed if some OS implementor chooses it as the native language of choice. But without that - not a chance in the foreseeable future.


I read about D one time, it was very intriguing and did seem like it would be a good language to do game programming in. Of course you need it to have the same portability as C++, otherwise you're sacrificing too much.

Quasar said:

I'm worried about what will happen with evolving C standards in the future and compiler support for code which isn't fully compliant.

Quasar said:

Currently EE no longer successfully executes if the -fno-strict-aliasing flag is removed and -O2 or higher are used, with GCC 4.5. This is due to the "strict aliasing" requirement of the C99 standard which abruptly killed off the ability to cast so-called "unrelated type" pointers. No exception was made in the standard for similar structures (ie, structures that begin with an identical prologue of fields), and no exception was made for void *.


Strict aliasing allows more aggressive optimizations. GCC will usually warn you when code breaks, or might break, the strict aliasing rules.
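
When the goal is just to reinterpret the bytes of one type as another, copying with memcpy is a way to do it that, as far as I know, does not run afoul of the aliasing rules, and optimizing compilers usually turn the small fixed-size copy into a plain register move. A minimal sketch, assuming a 32-bit IEEE 754 float and a C99 <stdint.h>; float_bits is a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of f without casting its address. */
uint32_t float_bits(float f)
{
   uint32_t u;

   /* the sketch only makes sense if both types have the same size */
   assert(sizeof(u) == sizeof(f));

   memcpy(&u, &f, sizeof(u));
   return u;
}
```

The cast version, *(uint32_t *)&f, is the kind of code that GCC's -Wstrict-aliasing warning is meant to catch.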

Quasar said:

Basically the ability to do structured or object-oriented programming in C has been butchered, and I'm concerned that support for C90 or prior code that uses these once-valid constructs will eventually disappear. GCC is probably pretty safe for the foreseeable future due to the fact that prominent projects such as the Linux kernel depend on the switch. But what about commercial products like Visual C++, which typically have less customizability when it comes to standards-compliance and optimization behavior?


Microsoft has stated that it has no plans to support C99 in Visual C++; they are focused entirely on ISO C++.

Quasar said:

Moving to C++ is a partial solution, but this would break as many things as it would fix. For example the zone system is not compatible with C++ objects due to the way in which C++ makes assumptions about alignment when generating code to access members of class instances. So even if we go C++ and convert all our type-punning code into class hierarchies, we're suddenly cut off from our allocator, and its support for domain-specific lifetimes and garbage-collected auto allocs.


You could overload operator new and operator delete in a base class and have every class derive from it, kind of like a Java superclass, but it's a bit messy.

Quasar said:

So what do we do? I think we've become a project stuck between languages with no safe place to turn.


Stick with C89 or C90. With the shitloads of code out there today, removing support for it would be suicide. GCC requires a flag to compile in C99 mode.

Now if you move to C++, what about C++0x!?! It's quite possible that will break your C++ code.

Also if you are going to do C++, do real C++ really. Don't hide C code in C++ code. I'm talking about going pure OOP and not some junky non-OOP implementation of it all.

Also depending on the C++ features you use there will be more overhead so your program will run slightly slower.

GhostlyDeath said:

pure OOP



That's often as bad as no OOP. Programmers should use what works best, not force their code into one scheme, if the language allows some flexibility.

BTW, if you take a look at Eternity's SVN you'd see that Quasar is using OOP features where it makes sense.

Graf Zahl said:

That's often as bad as no OOP. Programmers should use what works best, not force their code into one scheme, if the language allows some flexibility.

BTW, if you take a look at Eternity's SVN you'd see that Quasar is using OOP features where it makes sense.

Yep, I intend the initial conversion to include only what's necessary to wipe out all use of type punning inherited from the C implementations - the majority of this work has been spent on the thinker system. The rest has been invested into the metatable and its chain of dependencies (ehash, mdllist). Those two were implemented as generic algorithms in C using type punning, and so they've been transformed into template classes in C++.

EE won't magically transform into a Nazi OO program of pure non-POD objects that all goose-step to the beat of virtual methods, strict encapsulation, and RAII idioms.

Just in the interest of efficiency alone, even some of the classes, like CDLListItem, will remain deliberately designed as PODs for the foreseeable future.

Quasar said:

EE won't magically transform into a Nazi OO program of pure non-POD objects that all goose-step to the beat of virtual methods, strict encapsulation, and RAII idioms.


Hah! You ain't seen Nazi OO until you start using Factories of factories that are used by other factories ;-)

On the bright side, if you take a look at e.g. Mocha Doom you'll see it's nowhere near as Nazi OO as you'd expect: I use objects, interfaces and abstract classes where it makes sense, not just for the sake of it. In comparison, e.g. ZDoom is much harder to follow and is probably much more objectified, even compared to something like Jake2.

Maes said:

Hah! You ain't seen Nazi OO until you start using Factories of factories that are used by other factories ;-)

When you want to sum an array of floats you must use well-designed, understandable C++ code:

template <typename T, typename Tdat, typename Policy>
struct Accumulate
{  
  template <uint shiff>
  static float up(Tdat *a, const uint &n1, const uint &n2) 
  {     
    T sum;
    if (shiff*2 < n2 - n1)
      sum = Accumulate::up<shiff*2>(a, n1,  n2);
    else
      sum = Accumulate::down<shiff>(a, n1, n1, n2-1);    
    return sum;
  }
  template <uint shiff>
  static T down(Tdat *a, const uint  i, const uint &n1, const uint &n2) 
  {
    T sum = T();
    Policy::result(sum, Accumulate::down<shiff/2>(a, i, n1, n2)); 
    if (i + shiff <= n2)  
      Policy::result(sum, Accumulate::down<shiff/2>(a, i + shiff, n1, n2));    
    return sum;
  }
  template <>
  static T down<1>(Tdat *a, const uint  i, const uint &n1, const uint &n2) 
  {
    T sum = T(); 
    Policy::result(sum, a[i]);     
    if (i < n2)
      Policy::result(sum, a[i + 1]);
    return sum;
  }
};

struct SumPolicy
{
  template <typename T>
  static void result(T &res, T x)  {res += x;}            
};

summ = Accumulate<float, float, SumPolicy>::up<1>(a, n0, n);
NOT SOMETHING STUPID LIKE THAT!
  float summ = 0;
  for (int i = 0; i < n; i++)
    summ += a[i];


You KNOW you need factories of Policy and Accumulate, else you ain't shit.

Graf Zahl said:

WTF???

#include <stdio.h>

int main(int argc, char* argv[])
{
  unsigned int i;
  float f;
  
  f = 0.0f;
  for (i = 0; i < 100000000; i++)
    f += 1.0f;
  printf("summ1: %f\n", f);

  f = 0.0f;
  for (i = 0; i < 1000000; i++)
    f += 100.0f;
  printf("summ2: %f\n", f);
  
  getchar();
  return 0;
}
summ1: 16777216.000000
summ2: 98684352.000000


LOL you reminded me of how large roundoff errors can become with floats. Try adding up 1.01 100000 times, and display the difference from n*1.01 where n=1,2,3 ..... 100000 ;-)

It will go up all the way to +/- 800 if I recall, and it will also vary periodically with accumulation. There will be points of "error minima" and "error maxima", so in multithreaded accumulators selecting a particular number of threads may throw you off more than others :-p
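
A quick way to watch that drift is to keep a float running total next to the product computed in double. This is a sketch only; drift_after is a hypothetical helper, and the exact error you see depends on the platform's floating-point behavior, so no particular figure is claimed here.

```c
#include <assert.h>

/* Difference between adding 1.01f n times in single precision
   and the reference product n * 1.01 computed in double. */
double drift_after(unsigned int n)
{
   volatile float sum = 0.0f; /* volatile: round to float on every step */
   unsigned int i;

   assert(n > 0);

   for(i = 0; i < n; i++)
      sum += 1.01f;

   return (double)sum - (double)n * 1.01;
}
```

Calling drift_after for n = 1, 2, 3 ... 100000 and printing the results should show the error growing and changing direction as the total crosses each power of two, because the spacing between representable floats doubles there.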


I just read through this (too fast), so don't jump all over me for this quick reply (off the top of my head).

Just went through some of this in the latest DoomLegacy code:
* We planned to move to the C99 standard but still have some C99 incompatibilities to fix, one of which was that stack allocation is not inherently supported,
and there may have been some pointer issues too. Right now we are staying with the default, which I believe to be C90.

* We have gone through almost every structure and fixed code that was endian or compiler word-size sensitive (I run a 32bit machine, one other developer runs a 64bit machine).
* Code that required multiple structures to have the same fields was
fixed. I do not trust compilers to always align the fields in such structures identically.
Unions are used where needed, as they are the best solution in C for
some problems.
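
One union-based arrangement that I believe C does bless is the "common initial sequence" rule: if several structs inside a union start with the same members, those shared members may be read through any of them while the union type is visible. A sketch with invented types (actor_t and friends are not DoomLegacy code):

```c
#include <assert.h>
#include <stddef.h>

enum { KIND_MONSTER, KIND_MISSILE };

typedef struct { int kind; }             header_t;
typedef struct { int kind; int health; } monster_t;
typedef struct { int kind; int speed; }  missile_t;

/* All three members share the leading int kind field. */
typedef union
{
   header_t  header;
   monster_t monster;
   missile_t missile;
} actor_t;

/* Dispatch on the shared field, then read the matching member. */
int actor_value(const actor_t *a)
{
   assert(a != NULL);

   switch(a->header.kind)
   {
   case KIND_MONSTER: return a->monster.health;
   case KIND_MISSILE: return a->missile.speed;
   default:           return -1;
   }
}
```

Because every access goes through the union, the compiler lays the structs out inside one type and nothing depends on two separately declared structs happening to line up.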

* The improved optimizer in GCC is what needs to know about pointer aliasing. Keeping data in registers on machines with large register
sets is the potential gain.
Spend a few days reading the info pages and you will probably find a switch to turn off whichever optimizer feature is giving you grief. You could also try -pedantic to have GCC-specific extensions flagged.
We changed some of our compile switches recently (cannot remember which right now, or why).

The GNU info pages are chapter-organized to the point of hindering their usefulness, and the GCC info is only slightly better than BASH's.
You really have to search, usually by going to the keyword index and flailing around, or by searching with a known keyword. Going in through the table of contents will usually lead you in circles.

I program embedded systems, and one thing I do not allow in OO-like
code is the use of getters/setters. They do not accomplish anything
except a formalizing of the interface, which could also be done
with a few comments and some discipline, and without the code bloat.

Now I will have to revisit the C99 issue in DoomLegacy to see exactly
what issues were remaining and which got solved.

The DoomLegacy C++ version (2.0) is mostly done, but got bogged down because those team members got worn out. You could expect the same
if you try that. It is always three to four times the work that is predicted. Whether it is worth it depends upon how much extensive work you are planning to do to EE and how much a new code rewrite would help with that. It is going to introduce dozens of new bugs into the code. I work on old commercial code, and rewriting it
in a new standard exposes so many compatibility issues that I would not recommend it unless it has the potential to remove an equal number of existing issues (which it usually does not).

entryway said:

floating point numbers...

The number of bits required to store your result 100000000 exactly = log2(100000000) ≈ 26.6, so 27 bits.

32-bit floats don't have enough bits: the significand holds only 24.
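
To put numbers on that: with a 24-bit significand, whole numbers are only exact up to 2^24 = 16777216, which is where the first loop above stalls. A small sketch (helper names are mine, and a 32-bit IEEE 754 float is assumed) that counts the bits a value needs and reproduces the stall:

```c
#include <assert.h>

/* Number of bits needed to store v as an unsigned integer. */
unsigned int bits_needed(unsigned long v)
{
   unsigned int bits = 0;

   while(v != 0)
   {
      bits++;
      v >>= 1;
   }
   return bits;
}

/* Add 1.0f to a float total n times, rounding to float each step. */
float count_up(unsigned long n)
{
   volatile float f = 0.0f;
   unsigned long i;

   assert(sizeof(float) == 4); /* sketch assumes a 32-bit float */

   for(i = 0; i < n; i++)
      f += 1.0f;
   return f;
}
```

Once the total reaches 16777216, adding 1.0f lands exactly halfway between two representable values and rounds back down, so the loop makes no further progress. Accumulating in a double, which has a 53-bit significand, avoids the problem for totals of this size.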

wesleyjohnson said:

The DoomLegacy C++ version (2.0) is mostly done, but got bogged down because those team members got worn out.



Actually, the problem here was that too much was done all at once. If you want to rewrite the entire engine, that's fine, of course, but you go after it subsystem by subsystem and make frequent beta releases so that the whole thing gets tested in real life. You start at the bottom and work your way up gradually.

You do not change everything all at once and hope for the best - because that will never ever happen.

I wish you good luck with Legacy 2.0 but to be honest, I still expect it to go through a very rough period in which all the bugs have to be found and fixed.


There's some code in ZDoom where I'd like to do the same but in the end it has to be worth it. For example, the automap could really use a rewrite - but what would it accomplish? The end result would still look the same and not offer any significant improvement so it's a task for a dull period in which I have nothing better to do.
