Polynomial/Advanced: Difference between revisions
imported>Barry R. Smith (intro, definition beginning) |
imported>Paul Wormer m (→Polynomials in one variable: Italic X in HTML math) |
{{subpages}}
In [[algebra]], a '''polynomial''' is, roughly speaking, a formal [[expression (mathematics)|expression]] obtained from constant numbers called [[coefficient]]s and one or more [[variable]]s by performing a finite number of [[addition]]s and [[multiplication]]s. For instance, <math>x^2-2x+1</math> is a polynomial involving one variable, ''x'' (often called a polynomial ''in one variable''), whereas <math>x^2+y^2</math> is a polynomial in two variables, <math>x</math> and <math>y</math>.
In order to add and multiply polynomials, we need only know how to add and multiply their coefficients. In [[abstract algebra]], an abstract collection of objects that can be added and multiplied subject to the usual algebraic rules is called a [[ring (mathematics)|ring]]. Polynomials can be defined with coefficients from an arbitrary ring, although they will often exhibit unfamiliar properties not shared by the more common polynomials with [[real number]] coefficients. Polynomials with coefficients in an arbitrary ring form a ring themselves, with the usual addition and multiplication operations for polynomials, called a [[polynomial ring]].
When the polynomials have coefficients in a special type of ring, called a [[field (mathematics)|field]], they behave in many ways like polynomials with real number coefficients. Polynomials with real number coefficients are discussed on the main [[polynomial]] page. This "advanced" version discusses polynomials with coefficients in a field.
== Review of fields ==
A field is a set of objects, numbers if you like, which can be added and multiplied. A field contains distinct [[additive identity|additive]] and [[multiplicative identity|multiplicative identities]], denoted 0 and 1. The objects in a field can also be subtracted and divided, with the usual restriction that division by 0 is not allowed. In particular, the fraction <math>\frac{a}{b}</math> is defined for any ''a'' and ''b'' in the field with <math>b \neq 0</math>. Finally, we require the addition and multiplication operations to be both [[commutative property|commutative]] and [[associative property|associative]], and for multiplication to be [[distributive property|distributive]] over addition. All of the usual algebraic rules for manipulating sums, differences, products, and quotients of numbers and fractions hold for the elements of a field.
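As a concrete illustration that is not taken from the text above, the integers modulo a prime form a field. The following minimal Python sketch assumes the prime 5; the names <code>P_MOD</code>, <code>add</code>, <code>mul</code> and <code>div</code> are hypothetical helpers chosen only for this example.

```python
# Arithmetic in the field of integers modulo a prime (assumed here: 5).
# All names are illustrative; they do not come from the article.
P_MOD = 5

def add(a, b):
    """Sum of two field elements."""
    return (a + b) % P_MOD

def mul(a, b):
    """Product of two field elements."""
    return (a * b) % P_MOD

def div(a, b):
    """The fraction a/b in the field; division by 0 is not allowed."""
    if b % P_MOD == 0:
        raise ZeroDivisionError("division by zero in the field")
    # pow(b, -1, P_MOD) is the multiplicative inverse of b (Python 3.8+).
    return (a * pow(b, -1, P_MOD)) % P_MOD

quotient = div(3, 4)      # the fraction 3/4 computed modulo 5
check = mul(quotient, 4)  # multiplying back by 4 recovers 3
```

Every nonzero element has an inverse here because 5 is prime; with a composite modulus the same code would fail for some nonzero divisors, which is one way to see that not every ring is a field.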
== Polynomials over a field ==
=== Polynomials in one variable ===
There are many possible equivalent approaches to defining polynomials. For instance, they can be defined as the [[convolution algebra]] of the [[monoid]] of non-negative powers of the generator ''X'' of a cyclic group. This method also allows one to define non-commuting polynomial rings, and to view polynomials in one variable as a special case. Alternatively, polynomials can be defined as [[infinite sequence]]s of coefficients such that all but a finite number of coefficients are equal to zero. This approach is useful because it allows one to view a polynomial ring as a [[subring]] of a [[ring of formal power series]]. This is the approach that will be used in this article.
Let us consider some expressions like <math>X^2-2X+1</math>, <math>\frac{1}{2}X^3+X-\sqrt{2}</math>, or <math>2X^5-3X^2+1</math>. We can write all of them as follows:
: <math>X^2-2X+1=1+(-2)X+1X^2+0X^3+0X^4+\cdots,</math>
: <math>\frac{1}{2}X^3+X-\sqrt{2}=-\sqrt{2}+1X+0X^2+\frac{1}{2}X^3+0X^4+\cdots,</math>
: <math>2X^5-3X^2+1=1+0X+(-3)X^2+0X^3+0X^4+2X^5+0X^6+\cdots.</math>
This suggests that a polynomial is entirely determined by a sequence of numbers, called its ''coefficients'', all of which are zero from some index onward. For instance, the three polynomials above can be written respectively as <math>(1,-2,1,0,0,\cdots)</math>, <math>\left(-\sqrt{2},1,0,\frac{1}{2},0,\cdots\right)</math>, and <math>(1,0,-3,0,0,2,0,\cdots)</math>, the dots meaning that the sequence continues with infinitely many zeros. This leads to the definition below.
'''Definition.''' A ''polynomial'' <math>P</math> over the ring <math>R</math> is a sequence <math>P=\left(a_0,a_1,a_2,\cdots,a_n,\cdots\right)</math> of elements of <math>R</math>, called the ''coefficients'' of <math>P</math>, containing only a finite number of nonzero terms. The index of the last nonzero term is called the ''degree'' of the polynomial.
Hence, the degrees of the three polynomials given above are respectively 2, 3 and 5. By convention, the degree of <math>(0,0,\cdots)</math> is set to <math>-\infty</math>.
This definition may surprise the reader, because one usually thinks of a polynomial as an expression of the form <math>a_0+a_1X+a_2X^2+\cdots+a_nX^n</math> rather than <math>\left(a_0,a_1,a_2,\cdots ,a_n,\cdots\right)</math>. We will progressively show how to return to this usual way of writing a polynomial. First, we identify any element <math>a_0</math> of the ring with the polynomial <math>\left(a_0,0,0,\cdots\right)</math>. For instance, we write only <math>7</math> instead of the cumbersome <math>\left(7,0,0,\cdots\right)</math> (or, in the familiar fashion, <math>7+0X+0X^2+\cdots</math>).
Secondly, we simply denote by <math>X</math> the polynomial
<center><math>X=\left(0,1,0,0,\cdots\right)</math>.</center>
This is natural, since in the familiar fashion this sequence corresponds to <math>0+1X+0X^2+0X^3+\cdots</math>. It remains to give a meaning to <math>X^2</math>, <math>X^3</math>, etc. This will be done in the next two subsections.
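The sequence definition translates directly into a data structure. The following is a minimal Python sketch, under the assumption that a polynomial is stored as a finite list holding the coefficients up to the last nonzero one, the infinite tail of zeros being left implicit. The names <code>trim</code> and <code>degree</code> are hypothetical helpers, not notation from this article.

```python
import math

def trim(coeffs):
    """Drop trailing zeros, which stand for the implicit infinite tail."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def degree(coeffs):
    """Index of the last nonzero coefficient; -infinity for the zero polynomial."""
    trimmed = trim(coeffs)
    return len(trimmed) - 1 if trimmed else -math.inf

# The three examples from the text, lowest-index coefficient first.
p1 = [1, -2, 1, 0, 0]              # X^2 - 2X + 1
p2 = [-math.sqrt(2), 1, 0, 0.5]    # (1/2)X^3 + X - sqrt(2), in floating point
p3 = [1, 0, -3, 0, 0, 2, 0]        # 2X^5 - 3X^2 + 1

degrees = [degree(p1), degree(p2), degree(p3)]
X = [0, 1]                         # the polynomial denoted X
```

Storing only a finite prefix is a design choice made for this sketch; it is faithful to the definition precisely because all but finitely many coefficients are zero.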
=== Polynomials in several variables ===
== Operations ==
We now define addition and multiplication of polynomials.
==== Addition ====
With the traditional notation, if we have <math>P=2X^5-3X^2+1</math> and <math>Q=-X^5+4X^4+2X^2-1</math>, we want to have <math>P+Q=(2-1)X^5+4X^4+(-3+2)X^2+1-1=X^5+4X^4-X^2</math>; that is, we want to add the coefficients separately for each degree. This leads to the formal definition below.
'''Definition.''' Given two polynomials <math>P=\left(a_0,a_1,a_2,\dots\right)</math> and <math>Q=\left(b_0,b_1,b_2,\dots\right)</math>, the sum <math>P+Q</math> is defined by <math>P+Q=\left(a_0+b_0,a_1+b_1,a_2+b_2,\dots\right)</math>.
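The termwise definition of the sum can be sketched in a few lines of Python, assuming that a polynomial is stored as a finite list of coefficients (lowest index first) and that missing entries count as zero. The name <code>poly_add</code> is a hypothetical helper introduced for this illustration.

```python
from itertools import zip_longest

def poly_add(p, q):
    """Add coefficient lists index by index, padding the shorter one with zeros."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

# The example from the text, lowest-index coefficient first.
P = [1, 0, -3, 0, 0, 2]     # 2X^5 - 3X^2 + 1
Q = [-1, 0, 2, 0, 4, -1]    # -X^5 + 4X^4 + 2X^2 - 1
S = poly_add(P, Q)          # [0, 0, -1, 0, 4, 1], that is X^5 + 4X^4 - X^2
```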
==== Multiplication ====
Multiplication is harder to define. Let us begin with an example using traditional notation. For <math>P=X^2+X-2</math> and <math>Q=2X^2-3X+1</math>, we want to have
<center><math>PQ=X^2\left(2X^2-3X+1\right)+X\left(2X^2-3X+1\right)-2\left(2X^2-3X+1\right)</math>;</center>
<center><math>PQ=2X^4+(-3+2)X^3+(1-3-2\cdot 2)X^2+(1-2\cdot (-3))X-2</math>;</center>
<center><math>PQ=2X^4-X^3-6X^2+7X-2 \ </math>.</center>
One can observe that the coefficient of, say, <math>X^2</math> is obtained by adding <math>1\cdot 1</math>, <math>1\cdot (-3)</math> and <math>-2\cdot 2</math>, that is, by adding all the products <math>a_ib_j</math> such that <math>i+j=2</math>, where the <math>a_i</math> denote the coefficients of <math>P</math> and the <math>b_j</math> those of <math>Q</math>. This observation leads to the definition below.
'''Definition.''' Given two polynomials <math>P=\left(a_0,a_1,a_2,\cdots\right)</math> and <math>Q=\left(b_0,b_1,b_2,\cdots\right)</math>, the product <math>PQ</math> is defined by <math>PQ=\left(c_0,c_1,c_2,\cdots\right)</math>, where for every index <math>k</math>, the coefficient <math>c_k</math> is given by <math>c_k=\sum_{i+j=k}a_ib_j</math>.
A reader put off by this cumbersome notation need only remember that the definition lets us multiply polynomials (considered as mere sequences of coefficients) exactly as one does in elementary algebra (using the traditional notation, as in the example). The only striking fact is that, in our construction, <math>X</math> does not represent a number, but a purely abstract entity for which we have defined some rules of calculation.
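The product formula is a discrete convolution of the two coefficient sequences. Below is a minimal Python sketch of it, assuming finite coefficient lists with the lowest index first; <code>poly_mul</code> is a hypothetical name used only here.

```python
def poly_mul(p, q):
    """Return the coefficients c_k, each the sum of a_i * b_j over i + j = k."""
    if not p or not q:
        return []                    # a product with the zero polynomial is zero
    c = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b        # a_i * b_j contributes to index i + j
    return c

# The example from the text, lowest-index coefficient first.
P = [-2, 1, 1]       # X^2 + X - 2
Q = [1, -3, 2]       # 2X^2 - 3X + 1
PQ = poly_mul(P, Q)  # [-2, 7, -6, -1, 2], that is 2X^4 - X^3 - 6X^2 + 7X - 2
```

The double loop visits every pair of indices once, so the cost grows with the product of the two list lengths. Faster methods exist, but this form mirrors the definition most directly.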
==== The algebra <math>R[X]</math> ====
With the definition above, one can verify that the product of the polynomial <math>X=\left(0,1,0,0,\dots\right)</math> by itself, that is <math>X^2</math>, is the sequence <math>X^2=\left(0,0,1,0,0,\dots\right)</math>. More generally, for each [[natural number]] <math>n</math>, one can verify that the <math>n</math>-th power of <math>X</math> is given by
<math>X^n=\left(0,\dots,0,1,0,0,\dots\right)</math>, where the <math>1</math> is the coefficient of index <math>n</math> and all other coefficients are zero. In particular, we have the usual convention <math>X^0=\left(1,0,0,\dots\right)</math>, which we have identified with the constant <math>1</math>.
Now, any polynomial <math>P=\left(a_0,a_1,a_2,\dots,a_n,0,0,\dots\right)</math> is ''exactly'' equal to <math>a_0+a_1X+a_2X^2+\cdots+a_nX^n</math>, where the addition and the powers (which are merely repeated multiplications) are defined as in the preceding subsections. Our whole construction justifies the traditional notation, and from now on we will use only the latter, with which calculations follow the natural rules of elementary algebra. It is however important to remember that the "variable" <math>X</math> did not denote some number in our construction, but a particular sequence of coefficients. We have succeeded in defining polynomials in a purely formal manner.
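Both claims can be checked mechanically. The Python sketch below assumes finite coefficient lists (lowest index first) and hypothetical helpers <code>poly_add</code>, <code>poly_mul</code> and <code>power_of_x</code>. It computes powers of <math>X</math> by repeated multiplication and rebuilds a polynomial from the terms <math>a_iX^i</math>.

```python
from itertools import zip_longest

def poly_add(p, q):
    """Termwise sum, padding the shorter list with zeros."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    """Product with c_k equal to the sum of a_i * b_j over i + j = k."""
    if not p or not q:
        return []
    c = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b
    return c

X = [0, 1]                      # the sequence (0, 1, 0, 0, ...)

def power_of_x(n):
    """X^n by repeated multiplication, starting from X^0 = (1, 0, 0, ...)."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, X)
    return result

coeffs = [1, 0, -3, 0, 0, 2]    # 2X^5 - 3X^2 + 1 as a sequence
rebuilt = []
for i, a in enumerate(coeffs):
    # add the term a_i * X^i, the constant a_i being the sequence (a_i, 0, ...)
    rebuilt = poly_add(rebuilt, poly_mul([a], power_of_x(i)))
```

At the end of the loop, <code>rebuilt</code> holds the same list as <code>coeffs</code>, which is the statement that the sequence equals the sum of its terms in traditional notation.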
==== Operations and degree: the algebra <math>R_n[X]</math> ====
== Polynomials versus polynomial functions ==
It may be convenient to think of a polynomial as a function of its variables, that is, <math>x\mapsto x^2-2x+1</math> or <math>(x,y)\mapsto x^2+y^2</math>. Such a function is called a [[polynomial function]]. In reality, however, the two concepts are different: the unspecified variables are purely ''formal'' entities when one thinks of an abstract polynomial, whereas they are meant to be replaced by ''any number'' when one thinks of a function. The distinction is important in [[abstract algebra]], because what we have called "constant numbers" is more generally replaced by any [[ring (mathematics)|ring]], and for some rings the two concepts cannot be identified. There is no such problem with polynomials over rings of usual numbers like the [[integer]]s, the [[rational number|rational]], [[real number|real]] or [[complex number|complex]] numbers. Still, it is important to understand that calculations with polynomials can be carried out in a purely formal way, without giving any special ontological status to the variables. To make the distinction clear, it is common in algebra to denote the abstract variables with capital letters (<math>X</math>, <math>Y</math>, etc.), while variables of functions are still denoted with lowercase letters. We will use this convention in what follows.
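A standard example of a ring where the two concepts differ, not taken from the text above, is the field with two elements (the integers modulo 2). There the polynomial <math>X^2+X</math> has nonzero coefficients, yet the function it defines vanishes at every element. The Python sketch below checks this; <code>evaluate_mod</code> is a hypothetical helper and the coefficient-list representation is an assumption of the sketch.

```python
def evaluate_mod(coeffs, x, p):
    """Evaluate a coefficient list at x modulo p, using Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % p
    return result

poly = [0, 1, 1]                                        # X^2 + X
values = [evaluate_mod(poly, x, 2) for x in range(2)]   # values at 0 and 1
is_zero_polynomial = all(a % 2 == 0 for a in poly)      # compare coefficients
```

Here <code>values</code> contains only zeros while <code>is_zero_polynomial</code> is false: two distinct polynomials, <math>X^2+X</math> and <math>0</math>, define the same function on this field.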
Latest revision as of 09:04, 3 January 2009