Fortran: Difference between revisions

From Citizendium
imported>Paul Wormer
Revision as of 08:09, 31 July 2009


Fortran (Formula translation) is the oldest high-level computer language. It was the first computer language that put humans before computers: the early Fortran programmers did not have to know machine instructions, but could instead write statements that closely resemble mathematical formulas.

Because Fortran is the oldest high-level language, the scientific and engineering communities have an enormous legacy of Fortran programs, some of them as large as a million statements. It has long been the primary language for intensive (super)computing tasks, such as weather and climate modeling, oil reservoir simulation, computational fluid dynamics, computational chemistry, computational economics, and computational physics. Because the existing body of Fortran programs is too large to rewrite in a more modern language, Fortran is likely here to stay. A very noticeable feature of all Fortran variants that have appeared to date is their "upward compatibility": in principle, a modern Fortran compiler can still translate a Fortran program written in, say, 1959 and prepare it to run on a present-day computer.

The first version of the language was developed in 1954 by an IBM team led by John Backus. This version, known as Fortran I, was mainly restricted to IBM computers (notably the IBM 704). In 1958 a variant, known as Fortran II, was introduced that was also implemented by other computer companies. In 1966 the first standard endorsed by the American Standards Association (now ANSI) was established: Fortran-66.

From the end of the 1950s onward, many important computer language concepts were developed, such as recursion, dynamic storage allocation, local and global variables, and if-then-else statements. When it was decided at the end of the 1960s that a more modern variant of Fortran was needed (the variant that later came to be known as Fortran-77), there was much pressure to extend the language with such constructs. However, this was successfully resisted by programmers who feared that their introduction would come at the expense of the numerical efficiency of the language. At that time very sophisticated optimizing Fortran compilers were used for numerically intensive work, and it was feared that concepts such as recursion would impede the optimization. In consequence, the Fortran-77 standard does not allow recursion, dynamic storage allocation, or local variables. It does, however, allow more extended "if statements" and "loop" structures than its predecessors. Character handling facilities were also introduced in Fortran-77.

The Fortran-90 variant, which appeared during the 1990s, allows recursion, dynamic storage allocation, case statements, and modules, a construct that is reminiscent of the objects used in object-oriented languages. Fortran-90 also allows an array to be handled as one entity.

The latest standard, Fortran-2003, supports object-oriented and generic programming.

Some features of the language prior to Fortran 90

As stated earlier, much old (written before the 1990s) Fortran code is still in use. It is therefore of interest to discuss some of the rules and features of the older versions of the language, many of which are still valid, even in the latest standard.

The majority of statements are simply assignment statements of the form

      A = expression

where the expression on the right-hand side is first fully evaluated and then assigned to the variable named A.

Implicit typing

Fortran 77 introduced character data. Before that time Fortran only recognized real (floating point) numbers, integer numbers, and logicals (booleans). Names of variables have to begin with a letter. Only logicals (and in Fortran-77 characters) have to be declared explicitly. Real and integer variables are implicitly declared by the rule: all variables with names beginning with I, J, K, L, M, or N are integer; variables whose names start with any other letter are real.

The oldest Fortran variants only allowed variable names of at most 6 characters, which gave rise to names such as "CNVRG" for "convergence", "ITHR" for an integer threshold, etc. Moreover, the Fortran standard up to and including Fortran 77 formally prescribed capital letters.
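
The implicit typing rule can be illustrated with a minimal sketch in fixed-format Fortran 77. The program name IMPTYP and the variables N and X are chosen here only for illustration; ITHR is borrowed from the naming example above. No variable is declared, so the first letter of each name alone decides its type.

```fortran
C     Implicit typing sketch: nothing is declared, so the first
C     letter of each name fixes its type.
      PROGRAM IMPTYP
C     N starts with N, so it is INTEGER; 7/2 is integer division
C     and yields 3.
      N = 7/2
C     X starts with X, so it is REAL; 7.0/2.0 yields 3.5.
      X = 7.0/2.0
C     ITHR starts with I, so it is INTEGER; the real value 2.9 is
C     truncated to 2 on assignment.
      ITHR = 2.9
      PRINT *, N, X, ITHR
      END
```

A name such as THRESH would silently be real under the same rule, which is why integer quantities in older codes so often carry a leading I, J, K, L, M, or N.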

Fixed format

Fortran-90 introduced free format; until that time Fortran statements had a fixed format, adapted to punch cards. The first column can contain the letter "C", indicating a comment card. Apart from the comment character, the first five columns of the card are either blank or contain a numeric label. The 6th column is reserved for a continuation character: if any non-blank character is in the 6th column, the statement is interpreted as the continuation of the previous statement. Columns 7 through 72 contain the actual statement. Blanks in these columns are completely ignored and serve only to improve readability by humans. Statements are not closed by a semicolon or any other terminator.
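
The column rules can be made concrete with a short sketch. The program name FIXFMT, the variable TOTAL, and the label 100 are illustrative choices, not taken from any particular code; the leading blanks are significant here because they place each item in its column.

```fortran
C     Column 1 holds C, so this card is a comment.
      PROGRAM FIXFMT
C     The statement starts in column 7 and is continued on the next
C     card, which carries a non-blank character (here 1) in column 6.
      TOTAL = 1.0 + 2.0
     1      + 3.0
C     Blanks are not significant: GOTO and GO TO are the same.
      GOTO 100
C     The numeric label 100 sits in columns 1 through 5.
  100 PRINT *, TOTAL
      END
```

The two cards of the assignment form a single statement, so TOTAL receives the sum of all three constants.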

Change of execution flow

The oldest Fortran variants knew only the "arithmetic if". This is a statement of the form

      IF (N) 10, 20, 30

where execution continues at the statement labeled 10, 20, or 30 for N < 0, N = 0, and N > 0, respectively. A label occupies columns 1 through 5 and can be any number of up to 5 digits, provided it is unique within its program unit.

Later the "logical if" was added that was usually used in conjunction with a "goto" statement, for example

       IF (N .GE. M) GO TO 1234

If N ≥ M execution continues with the statement carrying the label 1234.

Fortran 77 knows the "if (a) then ... else ... endif" construct, where a is either true or false. This construct does not need labels. Prior to Fortran 77, programs were plagued by a proliferation of labels, which makes older Fortran programs very hard for humans to read.
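
As a sketch of the difference, the following program makes the same three-way decision twice, first with an arithmetic if and labels, then with the Fortran 77 block if. The program name IFDEMO and the labels are illustrative. The arithmetic if has been removed from recent standards, so a current compiler may need a legacy option or may issue a warning; that depends on the compiler and is an assumption here.

```fortran
C     Classify N three ways; names and labels are illustrative.
      PROGRAM IFDEMO
      N = -5
C     Arithmetic IF: jump to 10, 20 or 30 for N<0, N=0, N>0.
      IF (N) 10, 20, 30
   10 PRINT *, 'NEGATIVE'
      GO TO 40
   20 PRINT *, 'ZERO'
      GO TO 40
   30 PRINT *, 'POSITIVE'
   40 CONTINUE
C     Fortran 77 block IF: the same decision without labels.
      IF (N .LT. 0) THEN
         PRINT *, 'NEGATIVE'
      ELSE IF (N .EQ. 0) THEN
         PRINT *, 'ZERO'
      ELSE
         PRINT *, 'POSITIVE'
      END IF
      END
```

The labeled version needs four labels and two extra jumps for a decision that the block if expresses in one construct.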

Subroutines and functions

A Fortran program can be partitioned into smaller independent subprograms that are combined into one executable program at load/link time. The most common subprograms are "subroutines", which have a number of parameters that are input, output, or both. "Functions", which have input parameters and return a single value, are also possible. Functions are most commonly used for built-in subprograms provided by the vendor, such as sqrt (square root).
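
A minimal sketch of both kinds of subprogram follows. The names SUBDEM, HYPOT2, and SWAP are hypothetical and chosen only for this example; SQRT is the built-in square root mentioned above. Here the three program units sit in one file, but each could equally be compiled separately and combined at link time.

```fortran
C     Main program calling a function and a subroutine.
      PROGRAM SUBDEM
      X = 3.0
      Y = 4.0
C     HYPOT2 returns a single value through its own name.
      R = HYPOT2(X, Y)
C     SWAP uses its parameters for both input and output.
      CALL SWAP(X, Y)
      PRINT *, R, X, Y
      END

C     Function: the result is assigned to the function name, which
C     is REAL by implicit typing because it starts with H.
      FUNCTION HYPOT2(A, B)
      HYPOT2 = SQRT(A*A + B*B)
      RETURN
      END

C     Subroutine: exchanges the values of its two arguments.
      SUBROUTINE SWAP(A, B)
      T = A
      A = B
      B = T
      RETURN
      END
```

After the call to SWAP the main program sees X and Y exchanged, because the subroutine works on the caller's variables rather than on copies of their values.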

Common blocks and blank commons

As stated, the older Fortran variants did not explicitly distinguish local and global variables. As a matter of fact, most compilers made local copies of variables, and kept arrays global.

To provide for global variables Fortran has "common blocks", either named or unnamed (blank). These are memory areas that are assigned to the executable program at load/link time. Each subroutine in which the common block is declared has read/write access to it. Because the common block is an independent programming unit, no information about it other than its starting address is passed to a subroutine accessing it. In other words, every subroutine makes its own assumptions about the length and the layout of the common block. Needless to say, common blocks in the hands of careless programmers are almost infinite sources of confusion and programming errors.
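
The following sketch shows how each program unit imposes its own layout on the same named common block. The block name SHARED and the unit names COMDEM, SAME, and OTHER are hypothetical. All three declarations describe three reals, so the program is legal; only the names and the grouping differ.

```fortran
C     Three program units sharing one named common block.
      PROGRAM COMDEM
      COMMON /SHARED/ A, B, C
      A = 1.0
      B = 2.0
      C = 3.0
      CALL SAME
      CALL OTHER
      END

      SUBROUTINE SAME
C     Same layout as the main program: X, Y, Z overlay A, B, C.
      COMMON /SHARED/ X, Y, Z
      PRINT *, X, Y, Z
      RETURN
      END

      SUBROUTINE OTHER
C     Different layout: V(1) and V(2) overlay A and B, W overlays C.
      COMMON /SHARED/ V(2), W
      PRINT *, V(1), V(2), W
      RETURN
      END
```

Both subroutines print the values stored by the main program, since only the position within the block matters, not the name. Had OTHER declared the block as three integers instead, it would typically have been accepted just the same and the stored reals would have been reinterpreted, which is the kind of error the paragraph above warns about.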

Loops

The original form of the subscripted loop ("do loop") is

       DO 10 I = M, N
         ....
   10  (any executable statement)

The statements after the "do statement" up to and including the statement carrying the label 10 are executed for I varying in unit steps from M to N. Prior to Fortran 77, the loop was executed at least once, even for M = N+1; that is, the range of I was checked at the end of the loop cycle, not at the beginning. In the majority of cases this is not what the logic of the program dictates, so older Fortran programs are full of constructs such as

       IF (M .GT. N) GO TO 20
       DO 10 I = M, N
         ....
   10  (any executable statement)
   20  (any executable statement)

This was changed in Fortran 77, where by default the loop is skipped when M > N. This change was contrary to the "upward compatibility" philosophy of Fortran, and therefore most compilers introduced a "single trip" flag to request at least a single trip through the loop, in accordance with the older standard. Forgetting to set the flag (or not knowing that it is necessary) can cause crashed runs or, even worse, (slightly) wrong results.
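
A small sketch makes the difference in trip count visible. The program name DODEMO and the counter KOUNT are illustrative. With M = 5 and N = 4 the guarded loop is never entered, so KOUNT stays 0 under either standard; without the guard, a compiler following the older single-trip behavior would execute the loop body once.

```fortran
C     Count the trips through a loop whose range is empty.
      PROGRAM DODEMO
      M = 5
      N = 4
      KOUNT = 0
C     Guard as used in pre-77 code to skip the loop when M > N.
      IF (M .GT. N) GO TO 20
      DO 10 I = M, N
         KOUNT = KOUNT + 1
   10 CONTINUE
   20 PRINT *, KOUNT
      END
```

Under Fortran 77 rules the guard is redundant, but it is harmless, which is one reason it survives in so much legacy code.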

Arrays

Basically, arrays in the older Fortran versions are static and declared by a statement of the type

       DIMENSION A(100,100), B(25)

meaning that A(I,J) gives access to an element of the square array A, provided 1 ≤ I ≤ 100 and 1 ≤ J ≤ 100, and that B(K) gives access to an element of the linear array B, provided 1 ≤ K ≤ 25. Before Fortran-77 array indices had to be larger than 0. This restriction could be very cumbersome, because the index 0 appears naturally in many problems and the programmer had to shift the index by one, leading to confusion about the choice between ±1 shifts. In Fortran 77 upper and lower bounds of arrays are unrestricted.
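
The index shift can be sketched as follows, storing the same eleven values once in an array with the Fortran 77 lower bound 0 and once in a pre-77 style array that starts at 1. The names ARRDEM, P, and Q are hypothetical.

```fortran
C     The same data with and without a lower bound of 0.
      PROGRAM ARRDEM
C     Fortran 77 declarator with explicit lower bound 0.
      DIMENSION P(0:10)
C     Pre-77 style: indices start at 1, so index I is stored at I+1.
      DIMENSION Q(11)
      DO 10 I = 0, 10
         P(I) = 2.0*I
         Q(I+1) = 2.0*I
   10 CONTINUE
C     P(0) and Q(1) hold the same value, as do P(10) and Q(11).
      PRINT *, P(0), Q(1), P(10), Q(11)
      END
```

Every access to Q must carry the +1 shift, and forgetting it in a single place is exactly the source of confusion described above.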


(To be continued)