User:Nick Johnson: Difference between revisions

From Citizendium
Revision as of 10:08, 16 April 2007

Holy crap I think I'm addicted to Citizendium. My Contributions

Background

Nick Johnson has a bachelor's degree in Computer Engineering and Applied Mathematics from the University of Virginia. He works as an embedded systems developer in industrial sensing and controls. Outside of work, he is a computer, electronics and mechanics hobbyist.

Interests

In no particular order, Nick is interested in:

  • Bikes and Cycling -- city rides, country rides, bike maintenance and repair, bike hacks.
  • Organic Gardening -- mmmmm tomatoes.
  • Traveling -- trying to get to Africa for the first time this spring/summer.
  • Computer programming -- compilers for novel languages or architectures, embedded software development, open source software movement.
  • Electronics/Mechanics -- while a computer program is cool, designing a physical device that does something will impress even non-geeks.

As well as whatever Nick forgot to mention here. In general, if you show a passion for something, Nick can relate.

Trivia

  • Nick is a vegetarian.
  • Nick is a linux geek who has been coerced into using some lesser operating system at work. Oh yes, he is a zealot.
  • Nick likes to spend vacations sleeping on the streets of foreign countries.
  • Nick has never paid for a haircut, and never will.
  • Because Nick grew up in central Virginia, USA, Nick hates cold weather. His ideal climate is about 95F and 400% humidity.

Some Quotes

When you ask a creative person how they did something, they may feel a little guilty because they didn't really do it, they just saw something. It seemed obvious to them after awhile. That's because they were able to connect experiences they've had and synthesize new things. And the reason they were able to do that was that they've had more experiences or have thought more about their experiences than other people have.
- Steve Jobs, Wired (March, 1996)

If you need a machine and don't buy it, then you will ultimately find you have paid for it but don't have it
- Henry Ford.

My Focus on Citizendium

I'm mostly interested in writing a lot in the Computers workgroup, focusing on compilers. However, compilers are IMHO the heart of computer science, so I'm gonna touch a little bit of everything from math to theoretical cs to data structures in the process.

My articles that I'm proud of:

And that I've contributed to, and am proud of:

  • buffer overflow attack

Looking for a place to put this stuff

On the extensive use of pointers in C

Since C is widely used, and has been for many years, the usage of pointers in C warrants special attention here. Although C's pointer system permits unsafe usage, C programmers can use pointers to add many features to C beyond the typical data structures, such as information hiding and abstraction, an object-oriented programming style with inheritance, and reinterpretation typecasts.

Still, many programmers would consider these techniques kludges, and would recommend higher-level languages that have full syntactic and semantic support for these concepts.


Object-Oriented Programming, Information hiding and abstraction

Main Article: object-oriented programming, information hiding and abstraction

C is not an object-oriented programming (OOP) language, but one can still write object-oriented programs in C. OOP is, in fact, a design philosophy rather than a language feature; OOP languages add syntax and type checking that make the style easier to follow.

In C, a programmer can group related information into a single object by using what is known as a structure. For instance, shown below is an Employee object, which has-a name and salary.

/* in Employee.h */
struct Employee
{
  char *name;
  int   salary;
};

This definition will work, but it can create problems when the software needs to be modified in the future. Because the structure's layout is visible everywhere, programmers will tend to access its members directly. Continuing with our example, we can assume that many places throughout the program read the employee's name field.
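As a minimal sketch of that kind of coupling, consider a client routine that reads the name field directly. The function nameLength is a hypothetical example, not part of the original code; any routine that touches e->name would serve equally well.

```c
#include <assert.h>
#include <string.h>

/* The transparent definition from Employee.h. */
struct Employee
{
  char *name;
  int   salary;
};

/* Hypothetical client code: it depends on the field being called
   "name" and being a single C string. */
size_t nameLength(const struct Employee *e)
{
  assert(e != NULL && e->name != NULL);
  return strlen(e->name);
}
```

Every routine written this way must be found and edited if the name field is ever renamed or split.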

However, the programmers may later decide that the Employee's name should be represented as three fields: the first, middle and last names. Such a modification might look like this:

/* in Employee.h */
struct Employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

Again, this definition will work. However, because our representation has changed, many parts of the code will need to change as well. Software engineers have studied this problem and recommend information hiding and abstraction to reduce the amount of rework necessary when a data structure changes. In short, one should hide the internal representation of a data structure from the remainder of the program, and instead define a public interface that manipulates the data structure in a few well-defined and safe ways.

To employ information hiding, a C programmer would split the object's definition, so that only its interface is available to the remainder of the code. For example:

/* in Employee.h */
typedef struct s_employee *Employee;

Employee employee_new(void);
void employee_delete(Employee this);
void employee_setName(Employee this, const char *newName);
char *employee_getName(Employee this);

/* in Employee.c */
struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

/* definitions of each method follow */

This necessarily depends on the usage of pointers. Note that the header file Employee.h names the s_employee structure without defining it; this is a forward declaration, and it leaves s_employee an incomplete type. The compiler can store and pass around pointers to an incomplete type, because the size of a pointer does not depend on the layout of the structure it points to, but it cannot dereference such a pointer or take the size of the structure. As a result, the remainder of the program can manipulate pointers to s_employee but not the structures themselves, and any attempt to access individual fields of the object outside Employee.c produces a compile-time error; those fields have been made private, and the internal representation has been hidden.
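The method definitions themselves are not shown in the text. As a rough sketch of what they might look like, the block below puts both halves in one listing for brevity and implements the constructor, the destructor and a pair of salary accessors. The accessors employee_setSalary and employee_getSalary are hypothetical additions, not part of the interface above; the name accessors are omitted because their behaviour (how a full name is split and rejoined) is not specified.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- would live in Employee.h: the public interface ---- */
typedef struct s_employee *Employee;   /* pointer to an incomplete type */

Employee employee_new(void);
void employee_delete(Employee this);
void employee_setSalary(Employee this, int newSalary);  /* hypothetical */
int  employee_getSalary(Employee this);                 /* hypothetical */

/* ---- would live in Employee.c: the private representation ---- */
struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

/* Allocate an employee with no name and a salary of 0.
   Returns NULL if memory cannot be allocated. */
Employee employee_new(void)
{
  Employee this = malloc(sizeof *this);
  if (this != NULL)
  {
    this->firstName = this->middleName = this->lastName = NULL;
    this->salary = 0;
  }
  return this;
}

/* Release the employee and any name strings it owns. */
void employee_delete(Employee this)
{
  if (this == NULL)
    return;
  free(this->firstName);
  free(this->middleName);
  free(this->lastName);
  free(this);
}

void employee_setSalary(Employee this, int newSalary)
{
  assert(this != NULL);
  this->salary = newSalary;
}

int employee_getSalary(Employee this)
{
  assert(this != NULL);
  return this->salary;
}
```

Client code that includes only the header can call these functions, but an expression such as e->salary will not compile there, since the structure's members are unknown outside Employee.c.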

Inheritance

Main Article: inheritance

We shall extend the Employee object from the previous example. Suppose that, in addition to Employee objects, we also wish to have Manager objects. Managers will be the same as Employees, except that managers additionally have Secretaries. We want to create the Manager class as a derivative of the Employee class, and we do not want to re-write all of the methods that have already been written for the Employee class.

This can be accomplished by embedding the parent class as the first member of the child class. Because C guarantees that a pointer to a structure, suitably converted, points to its first member, we can define our Manager object and still manipulate it with Employee methods. For instance:

/* in Employee.h */
typedef struct s_employee *Employee;
typedef struct s_manager  *Manager;

Employee employee_new(void);
void employee_delete(Employee this);
void employee_setName(Employee this, const char *newName);
char *employee_getName(Employee this);

Manager manager_new(void);
void manager_delete(Manager this);
void manager_setSecretary(Manager this, Employee sec);
Employee manager_getSecretary(Manager this);

/* in Employee.c */
#include "Employee.h"

struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

struct s_manager
{
  struct s_employee Me;

  Employee mySecretary;
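};

/* definitions for each method follow */

This works because an s_manager structure begins with a complete s_employee structure, so a pointer to a Manager, converted to the Employee type, points at valid Employee data. Thus, a method that retrieves the salary for an Employee object will also work for a Manager object. The programmer must be quite careful with this technique: the conversion requires an explicit cast, and the compiler cannot check that an Employee pointer converted in the other direction really refers to a Manager.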
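A short sketch of the technique in use follows, written as a single listing. The salary accessors and the helper manager_asEmployee are hypothetical names introduced for illustration; only manager_new and manager_delete appear in the interface above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct s_employee *Employee;
typedef struct s_manager  *Manager;

struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

struct s_manager
{
  struct s_employee Me;   /* parent part: must be the first member */
  Employee mySecretary;
};

/* Hypothetical Employee methods, written once for the parent class. */
void employee_setSalary(Employee this, int newSalary)
{
  assert(this != NULL);
  this->salary = newSalary;
}

int employee_getSalary(Employee this)
{
  assert(this != NULL);
  return this->salary;
}

Manager manager_new(void)
{
  Manager this = malloc(sizeof *this);
  if (this != NULL)
  {
    this->Me.firstName = this->Me.middleName = this->Me.lastName = NULL;
    this->Me.salary = 0;
    this->mySecretary = NULL;
  }
  return this;
}

void manager_delete(Manager this)
{
  if (this == NULL)
    return;
  free(this->Me.firstName);
  free(this->Me.middleName);
  free(this->Me.lastName);
  free(this);   /* the secretary is a separate object and is not freed here */
}

/* Hypothetical helper: converts a Manager to its Employee part
   without a cast, by taking the address of the first member. */
Employee manager_asEmployee(Manager this)
{
  assert(this != NULL);
  return &this->Me;
}
```

With these definitions, employee_setSalary((Employee) m, 90000) and employee_getSalary(manager_asEmployee(m)) operate on the same storage for a Manager m. The helper is the safer spelling, since it keeps compiling only while Me remains a member of s_manager.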

Reinterpretation Typecasts

Main Article: reinterpretation typecast

Sometimes it is useful to manipulate data in ways that are not directly supported by a programming language. Take, for instance, the problem of generating pseudorandom floating point numbers. When performed frequently, as is the case with many simulations, the speed of a routine to generate these numbers is paramount.

Generating a pseudorandom integer is relatively fast; generating a pseudorandom floating point number, however, may require a division operation, which is (relatively) slow. One trick is to generate pseudorandom integers and reinterpret their bits as a floating point number, as below:

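#include <stdlib.h>  /* rand(), size_t */

int goodFloatingPointNumber(double d);  /* boundary-case predicate, defined elsewhere */

double fastFrand(void)
{
  double result;
  size_t i;
  /* View the storage of result as an array of ints.  Strictly, writing a
     double through an int pointer breaks C's aliasing rules; the cast is
     kept here because it is the trick being illustrated. */
  int *parts = (int *) &result;

  do
    for (i = 0; i < sizeof(double) / sizeof(int); i++)
      parts[i] = rand();   /* fill each int-sized piece with random bits */
  while (!goodFloatingPointNumber(result));

  return result;
}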

Here the predicate goodFloatingPointNumber() checks for a few boundary cases, for example bit patterns that do not encode an ordinary finite number.
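For comparison, here is a sketch of a related but different trick. It is not the routine above: rather than filling every bit at random and rejecting bad patterns, it fixes the sign and exponent bits and randomizes only the mantissa, then copies the bits into a double with memcpy, which avoids the aliasing problem. It assumes that double is an IEEE 754 64-bit value with the same byte order as uint64_t; the name frandBits is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a pseudorandom double in [0.0, 1.0) without dividing.
   Assumes IEEE 754 binary64 doubles (an assumption, not checked
   beyond the size test below). */
double frandBits(void)
{
  uint64_t bits = 0;
  double result;
  int i;

  /* rand() is only guaranteed to supply 15 random bits per call,
     so gather 4 * 15 = 60 bits and keep the low 52 for the mantissa. */
  for (i = 0; i < 4; i++)
    bits = (bits << 15) | (uint64_t) (rand() & 0x7FFF);
  bits &= UINT64_C(0x000FFFFFFFFFFFFF);

  /* Sign 0 and biased exponent 0x3FF: the value lies in [1.0, 2.0). */
  bits |= UINT64_C(0x3FF0000000000000);

  assert(sizeof result == sizeof bits);
  memcpy(&result, &bits, sizeof result);  /* reinterpret the bit pattern */

  return result - 1.0;                    /* shift the range to [0.0, 1.0) */
}
```

Because the exponent is fixed, every bit pattern produced is an ordinary finite number, so no rejection loop is needed; the cost is one subtraction instead of a division.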