@@ -122,7 +122,7 @@ From tgl@sss.pgh.pa.us Sun May 14 17:30:56 2000
122
122
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
123
123
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05808
124
124
for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:30:52 -0400 (EDT)
125
- Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id RAA16657 for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:29:52 -0400 (EDT)
125
+ Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.2 $) with ESMTP id RAA16657 for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:29:52 -0400 (EDT)
126
126
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
127
127
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA20914;
128
128
Sun, 14 May 2000 17:29:30 -0400 (EDT)
@@ -452,3 +452,184 @@ peter_e@gmx.net 75262 Uppsala
452
452
http://yi.org/peter-e/ Sweden
453
453
454
454
455
+ From tgl@sss.pgh.pa.us Tue Jun 13 04:58:20 2000
456
+ Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
457
+ by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24281
458
+ for <pgman@candle.pha.pa.us>; Tue, 13 Jun 2000 03:58:18 -0400 (EDT)
459
+ Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
460
+ by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA02571;
461
+ Tue, 13 Jun 2000 03:58:43 -0400 (EDT)
462
+ To: Bruce Momjian <pgman@candle.pha.pa.us>
463
+ cc: pgsql-hackers@postgresql.org
464
+ Subject: Re: [HACKERS] Proposal for fixing numeric type-resolution issues
465
+ In-reply-to: <200006130741.DAA23502@candle.pha.pa.us>
466
+ References: <200006130741.DAA23502@candle.pha.pa.us>
467
+ Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
468
+ message dated "Tue, 13 Jun 2000 03:41:56 -0400"
469
+ Date: Tue, 13 Jun 2000 03:58:43 -0400
470
+ Message-ID: <2568.960883123@sss.pgh.pa.us>
471
+ From: Tom Lane <tgl@sss.pgh.pa.us>
472
+ Status: OR
473
+
474
+ Bruce Momjian <pgman@candle.pha.pa.us> writes:
475
+ > Again, anything to add to the TODO here?
476
+
477
+ IIRC, there was some unhappiness with the proposal you quote, so I'm
478
+ not sure we've quite agreed what to do... but clearly something must
479
+ be done.
480
+
481
+ regards, tom lane
482
+
483
+
484
+ >> We've got a collection of problems that are related to the parser's
485
+ >> inability to make good type-resolution choices for numeric constants.
486
+ >> In some cases you get a hard error; for example "NumericVar + 4.4"
487
+ >> yields
488
+ >> ERROR: Unable to identify an operator '+' for types 'numeric' and 'float8'
489
+ >> You will have to retype this query using an explicit cast
490
+ >> because "4.4" is initially typed as float8 and the system can't figure
491
+ >> out whether to use numeric or float8 addition. A more subtle problem
492
+ >> is that a query like "... WHERE Int2Var < 42" is unable to make use of
493
+ >> an index on the int2 column: 42 is resolved as int4, so the operator
494
+ >> is int24lt, which works but is not in the opclass of an int2 index.
495
+ >>
496
+ >> Here is a proposal for fixing these problems. I think we could get this
497
+ >> done for 7.1 if people like it.
498
+ >>
499
+ >> The basic problem is that there's not enough smarts in the type resolver
500
+ >> about the interrelationships of the numeric datatypes. All it has is
501
+ >> a concept of a most-preferred type within the category of numeric types.
502
+ >> (We are abusing the most-preferred-type mechanism, BTW, because both
503
+ >> FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
504
+ >> category! This is in fact why the resolver can't make a choice for
505
+ >> "numeric+float8".) We need more intelligence than that.
506
+ >>
507
+ >> I propose that we set up a strictly-ordered hierarchy of numeric
508
+ >> datatypes, running from least preferred to most preferred:
509
+ >> int2, int4, int8, numeric, float4, float8.
510
+ >> Rather than simply considering coercions to the most-preferred type,
511
+ >> the type resolver should use the following rules:
512
+ >>
513
+ >> 1. No value will be down-converted (eg int4 to int2) except by an
514
+ >> explicit conversion.
515
+ >>
516
+ >> 2. If there is not an exact matching operator, numeric values will be
517
+ >> up-converted to the highest numeric datatype present among the operator
518
+ >> or function's arguments. For example, given "int2 + int8" we'd up-
519
+ >> convert the int2 to int8 and apply int8 addition.
520
+ >>
521
+ >> The final piece of the puzzle is that the type initially assigned to
522
+ >> an undecorated numeric constant should be NUMERIC if it contains a
523
+ >> decimal point or exponent, and otherwise the smallest of int2, int4,
524
+ >> int8, NUMERIC that will represent it. This is a considerable change
525
+ >> from the current lexer behavior, where you get either int4 or float8.
526
+ >>
527
+ >> For example, given "NumericVar + 4.4", the constant 4.4 will initially
528
+ >> be assigned type NUMERIC, we will resolve the operator as numeric plus,
529
+ >> and everything's fine. Given "Float8Var + 4.4", the constant is still
530
+ >> initially numeric, but will be up-converted to float8 so that float8
531
+ >> addition can be used. The end result is the same as in traditional
532
+ >> Postgres: you get float8 addition. Given "Int2Var < 42", the constant
533
+ >> is initially typed as int2, since it fits, and we end up selecting
534
+ >> int2lt, thereby allowing use of an int2 index. (On the other hand,
535
+ >> given "Int2Var < 100000", we'd end up using int4lt, which is correct
536
+ >> to avoid overflow.)
537
+ >>
538
+ >> A couple of crucial subtleties here:
539
+ >>
540
+ >> 1. We are assuming that the parser or optimizer will constant-fold
541
+ >> any conversion functions that are introduced. Thus, in the
542
+ >> "Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
543
+ >> time execution begins, so there's no performance loss.
544
+ >>
545
+ >> 2. We cannot lose precision by initially representing a constant as
546
+ >> numeric and later converting it to float. Nor can we exceed NUMERIC's
547
+ >> range (the default 1000-digit limit is more than the range of IEEE
548
+ >> float8 data). It would not work as well to start out by representing
549
+ >> a constant as float and then converting it to numeric.
550
+ >>
551
+ >> Presently, the pg_proc and pg_operator tables contain a pretty fair
552
+ >> collection of cross-datatype numeric operators, such as int24lt,
553
+ >> float48pl, etc. We could perhaps leave these in, but I believe that
554
+ >> it is better to remove them. For example, if int42lt is left in place,
555
+ >> then it would capture cases like "Int4Var < 42", whereas we need that
556
+ >> to be translated to int4lt so that an int4 index can be used. Removing
557
+ >> these operators will eliminate some code bloat and system-catalog bloat
558
+ >> to boot.
559
+ >>
560
+ >> As far as I can tell, this proposal is almost compatible with the rules
561
+ >> given in SQL92: in particular, SQL92 specifies that an operator having
562
+ >> both "approximate numeric" (float) and "exact numeric" (int or numeric)
563
+ >> inputs should deliver an approximate-numeric result. I propose
564
+ >> deviating from SQL92 in a single respect: SQL92 specifies that a
565
+ >> constant containing an exponent (eg 1.2E34) is approximate numeric,
566
+ >> which implies that the result of an operator using it is approximate
567
+ >> even if the other operand is exact. I believe it's better to treat
568
+ >> such a constant as exact (ie, type NUMERIC) and only convert it to
569
+ >> float if the other operand is float. Without doing that, an assignment
570
+ >> like
571
+ >> UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
572
+ >> will not work as desired because the constant will be prematurely
573
+ >> coerced to float, causing precision loss.
574
+ >>
575
+ >> Comments?
576
+ >>
577
+ >> regards, tom lane
578
+ >>
579
+
580
+
581
+ > --
582
+ > Bruce Momjian | http://www.op.net/~candle
583
+ > pgman@candle.pha.pa.us | (610) 853-3000
584
+ > + If your life is a hard drive, | 830 Blythe Avenue
585
+ > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
586
+
587
+ From tgl@sss.pgh.pa.us Mon Jun 12 14:09:45 2000
588
+ Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
589
+ by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01993
590
+ for <pgman@candle.pha.pa.us>; Mon, 12 Jun 2000 13:09:43 -0400 (EDT)
591
+ Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
592
+ by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA01515;
593
+ Mon, 12 Jun 2000 13:10:01 -0400 (EDT)
594
+ To: Peter Eisentraut <peter_e@gmx.net>
595
+ cc: Bruce Momjian <pgman@candle.pha.pa.us>,
596
+ "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>,
597
+ PostgreSQL-development <pgsql-hackers@postgresql.org>
598
+ Subject: Re: [HACKERS] Adding time to DATE type
599
+ In-reply-to: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
600
+ References: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
601
+ Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
602
+ message dated "Sun, 11 Jun 2000 13:41:24 +0200"
603
+ Date: Mon, 12 Jun 2000 13:10:00 -0400
604
+ Message-ID: <1512.960829800@sss.pgh.pa.us>
605
+ From: Tom Lane <tgl@sss.pgh.pa.us>
606
+ Status: ORr
607
+
608
+ Peter Eisentraut <peter_e@gmx.net> writes:
609
+ > Bruce Momjian writes:
610
+ >> Can someone give me a TODO summary for this issue?
611
+
612
+ > * make 'text' constants default to text type (not unknown)
613
+
614
+ > (I think not everyone's completely convinced on this issue, but I don't
615
+ > recall anyone being firmly opposed to it.)
616
+
617
+ It would be a mistake to eliminate the distinction between unknown and
618
+ text. See for example my just-posted response to John Cochran on
619
+ pgsql-general about why 'BOULEVARD'::text behaves differently from
620
+ 'BOULEVARD'::char. If string literals are immediately assigned type
621
+ text then we will have serious problems with char(n) fields.
622
+
623
+ I think it's fine to assign string literals a type of 'unknown'
624
+ initially. What we need to do is add a phase of type resolution that
625
+ considers treating them as text, but only after the existing logic fails
626
+ to deduce a type.
627
+
628
+ (BTW it might be better to treat string literals as defaulting to char(n)
629
+ instead of text, allowing the normal promotion rules to replace char(n)
630
+ with text if necessary. Not sure if that would make things more or less
631
+ confusing for operations that intermix fixed- and variable-width char
632
+ types.)
633
+
634
+ regards, tom lane
635
+
0 commit comments