Lists: | pgsql-hackers |
---|
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Hash support for row types |
Date: | 2020-10-19 08:01:14 |
Message-ID: | 38eccd35-4e2d-6767-1b3c-dada1eac3124@2ndquadrant.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
In [0] it was discussed that hash support for row types/record would be
handy. So I implemented that.
The implementation hashes each field and combines the hash values. Most
of the code structure can be borrowed from the record comparison
functions/btree support. To combine the hash values, I adapted the code
from the array hashing functions. (The hash_combine()/hash_combine64()
functions also looked sensible, but they don't appear to work in a way
that satisfies the hash_func regression test. Could be documented better.)
The main motivation is to support UNION [DISTINCT] as discussed in [0],
but this also enables other hash-related functionality such as hash
joins (as one regression test accidentally revealed) and hash partitioning.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Hash-support-for-row-types.patch | text/plain | 19.2 KB |
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hash support for row types |
Date: | 2020-10-19 23:32:34 |
Message-ID: | 20201019233234.r6lyxbvdg5s77rvd@alap3.anarazel.de |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 2020-10-19 10:01:14 +0200, Peter Eisentraut wrote:
> In [0] it was discussed that hash support for row types/record would be
> handy. So I implemented that.
> The implementation hashes each field and combines the hash values. Most of
> the code structure can be borrowed from the record comparison
> functions/btree support. To combine the hash values, I adapted the code
> from the array hashing functions. (The hash_combine()/hash_combine64()
> functions also looked sensible, but they don't appear to work in a way that
> satisfies the hash_func regression test. Could be documented better.)
>
> The main motivation is to support UNION [DISTINCT] as discussed in [0], but
> this also enables other hash-related functionality such as hash joins (as
> one regression test accidentally revealed) and hash partitioning.
How does this deal with row types with a field that doesn't have a hash
function? Erroring out at runtime could cause queries that used to
succeed, e.g. because all fields have btree ops, to fail, if we just have
a generic unconditionally present hash opclass? Is that an OK
"regression"?
Greetings,
Andres Freund
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hash support for row types |
Date: | 2020-10-20 15:10:24 |
Message-ID: | e88b1974-5987-9ae8-7e54-7d66f54751cb@2ndquadrant.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-10-20 01:32, Andres Freund wrote:
> How does this deal with row types with a field that doesn't have a hash
> function? Erroring out at runtime could cause queries that used to
> succeed, e.g. because all fields have btree ops, to fail, if we just have
> a generic unconditionally present hash opclass? Is that an OK
> "regression"?
Good point. There is actually code in the type cache that is supposed
to handle that, so I'll need to adjust that.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hash support for row types |
Date: | 2020-10-20 18:41:08 |
Message-ID: | CA+TgmobT6e4uV6JNMND7oM23Q8HUAgyEP4R1w9-G9w9jT0stDg@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Tue, Oct 20, 2020 at 11:10 AM Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> On 2020-10-20 01:32, Andres Freund wrote:
> > How does this deal with row types with a field that doesn't have a hash
> > function? Erroring out at runtime could cause queries that used to
> > succeed, e.g. because all fields have btree ops, to fail, if we just have
> > a generic unconditionally present hash opclass? Is that an OK
> > "regression"?
>
> Good point. There is actually code in the type cache that is supposed
> to handle that, so I'll need to adjust that.
Do we need to worry about what happens if somebody modifies the
opclass/opfamily definitions?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hash support for row types |
Date: | 2020-10-20 20:36:25 |
Message-ID: | 676046.1603226185@sss.pgh.pa.us |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Do we need to worry about what happens if somebody modifies the
> opclass/opfamily definitions?
There's a lot of places that you can break by doing that. I'm not
too concerned about it.
regards, tom lane
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Hash support for row types |
Date: | 2020-10-23 07:49:15 |
Message-ID: | f7195e93-ab9d-01f0-1140-361aece79248@2ndquadrant.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-10-20 17:10, Peter Eisentraut wrote:
> On 2020-10-20 01:32, Andres Freund wrote:
>> How does this deal with row types with a field that doesn't have a hash
>> function? Erroring out at runtime could cause queries that used to
>> succeed, e.g. because all fields have btree ops, to fail, if we just have
>> a generic unconditionally present hash opclass? Is that an OK
>> "regression"?
>
> Good point. There is actually code in the type cache that is supposed
> to handle that, so I'll need to adjust that.
Here is an updated patch with the type cache integration added.
To your point, this now checks each fields hashability before
considering a row type as hashable. It can still have run-time errors
for untyped record datums, but that's not something we can do anything
about.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Hash-support-for-row-types.patch | text/plain | 26.5 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Hash support for row types |
Date: | 2020-11-13 19:51:48 |
Message-ID: | 395784.1605297108@sss.pgh.pa.us |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> Here is an updated patch with the type cache integration added.
> To your point, this now checks each fields hashability before
> considering a row type as hashable. It can still have run-time errors
> for untyped record datums, but that's not something we can do anything
> about.
This looks good code-wise. A couple small niggles on the tests:
* The new test in with.sql claims to be testing row hashing, but
it's not very apparent that any such thing actually happens. Maybe
EXPLAIN the query, as well as execute it, to confirm that a
hash-based plan is used.
* Is it worth devising a test case in which hashing is not possible
because one of the columns isn't hashable? I have mixed feelings
about this because the set of suitable column types may decrease
to empty over time, making it hard to maintain the test case.
I marked it RFC.
regards, tom lane
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Hash support for row types |
Date: | 2020-11-17 13:25:59 |
Message-ID: | d1a898b2-cd3d-0519-def9-5f38163c68e8@2ndquadrant.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
I wrote a new patch to add a lot more tests around hash-based plans.
This is intended to apply separately from the other patch, and the other
patch would then "flip" some of the test cases.
On 2020-11-13 20:51, Tom Lane wrote:
> * The new test in with.sql claims to be testing row hashing, but
> it's not very apparent that any such thing actually happens. Maybe
> EXPLAIN the query, as well as execute it, to confirm that a
> hash-based plan is used.
The recursive union requires hashing, but this is not visible in the
plan. You only get an error if there is no hashing support for a type.
I have added a test for this.
For the non-recursive union, I have added more tests that show this in
the plans.
> * Is it worth devising a test case in which hashing is not possible
> because one of the columns isn't hashable? I have mixed feelings
> about this because the set of suitable column types may decrease
> to empty over time, making it hard to maintain the test case.
I used the money type for now. If someone adds hash support for that,
we'll change it. I don't think this will change too rapidly, though.
--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/
Attachment | Content-Type | Size |
---|---|---|
0001-Add-more-tests-for-hashing-and-hash-based-plans.patch | text/plain | 23.7 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Hash support for row types |
Date: | 2020-11-17 19:33:40 |
Message-ID: | 934167.1605641620@sss.pgh.pa.us |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> I wrote a new patch to add a lot more tests around hash-based plans.
> This is intended to apply separately from the other patch, and the other
> patch would then "flip" some of the test cases.
OK, that addresses my concerns.
regards, tom lane
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Hash support for row types |
Date: | 2020-11-19 08:44:43 |
Message-ID: | 809f4dbb-53ca-7758-a932-3094226c6ca3@2ndquadrant.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-11-17 20:33, Tom Lane wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
>> I wrote a new patch to add a lot more tests around hash-based plans.
>> This is intended to apply separately from the other patch, and the other
>> patch would then "flip" some of the test cases.
>
> OK, that addresses my concerns.
Thanks. I have committed the tests and then subsequently the feature patch.
--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/