doc/TODO.detail/performance
113 additions & 2 deletions
@@ -345,7 +345,7 @@ From owner-pgsql-hackers@hub.org Tue Oct 19 10:31:10 1999
 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
     by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA29087
     for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:31:08 -0400 (EDT)
-Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.8 $) with ESMTP id KAA27535 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:19:47 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.9 $) with ESMTP id KAA27535 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:19:47 -0400 (EDT)
 Received: from localhost (majordom@localhost)
     by hub.org (8.9.3/8.9.3) with SMTP id KAA30328;
     Tue, 19 Oct 1999 10:12:10 -0400 (EDT)
@@ -454,7 +454,7 @@ From owner-pgsql-hackers@hub.org Tue Oct 19 21:25:30 1999
 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
     by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA28130
     for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:25:26 -0400 (EDT)
-Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.8 $) with ESMTP id VAA10512 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:15:28 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.9 $) with ESMTP id VAA10512 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:15:28 -0400 (EDT)
+From pgsql-general-owner+M2497@hub.org Fri Jun 16 18:31:03 2000
+Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
+    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04165
+    for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:31:01 -0400 (EDT)
+Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.9 $) with ESMTP id RAA13110 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:20:12 -0400 (EDT)
+Received: from hub.org (majordom@localhost [127.0.0.1])
+    by hub.org (8.10.1/8.10.1) with SMTP id e5GLDaM14477;
+    Fri, 16 Jun 2000 17:13:36 -0400 (EDT)
+Received: from home.dialix.com ([203.15.150.26])
+    by hub.org (8.10.1/8.10.1) with ESMTP id e5GLCQM14064
+    for <pgsql-general@postgresql.org>; Fri, 16 Jun 2000 17:12:27 -0400 (EDT)
+Received: from nemeton.com.au ([202.76.153.71])
+    by home.dialix.com (8.9.3/8.9.3/JustNet) with SMTP id HAA95516
+    for <pgsql-general@postgresql.org>; Sat, 17 Jun 2000 07:11:44 +1000 (EST)
+    (envelope-from giles@nemeton.com.au)
+Received: (qmail 10213 invoked from network); 16 Jun 2000 09:52:29 -0000
+Received: from nemeton.com.au (203.8.3.17)
+    by nemeton.com.au with SMTP; 16 Jun 2000 09:52:29 -0000
+To: Jurgen Defurne <defurnj@glo.be>
+cc: Mark Stier <kalium@gmx.de>,
+    postgreSQL general mailing list <pgsql-general@postgresql.org>
+Subject: Re: [GENERAL] optimization by removing the file system layer?
+In-Reply-To: Message from Jurgen Defurne <defurnj@glo.be>
+    of "Thu, 15 Jun 2000 20:26:57 +0200." <39491FF1.E1E583F8@glo.be>
+Date: Fri, 16 Jun 2000 19:52:28 +1000
+Message-ID: <10210.961149148@nemeton.com.au>
+From: Giles Lean <giles@nemeton.com.au>
+X-Mailing-List: pgsql-general@postgresql.org
+Precedence: bulk
+Sender: pgsql-general-owner@hub.org
+Status: OR
+
+
+
+> I think that the Un*x filesystem is one of the reasons that large
+> database vendors rather use raw devices, than filesystem storage
+> files.
+
+This used to be the preference, back in the late 80s and possibly
+early 90s. I'm seeing a preference toward using the filesystem now,
+possibly with some sort of async I/O and co-operation from the OS
+filesystem about interactions with the filesystem cache.
+
+Performance preferences don't stand still. The hardware changes, the
+software changes, the volume of data changes, and different solutions
+become preferable.
+
+> Using a raw device on the disk gives them the possibility to have
+> complete control over their files, indices and objects without being
+> bothered by the operating system.
+>
+> This speeds up things in several ways :
+> - the least possible OS intervention
+
+Not that this is especially useful, necessarily. If the "raw" device
+is in fact managed by a logical volume manager doing mirroring onto
+some sort of storage array there is still plenty of OS code involved.
+
+The cost of using a filesystem in addition may not be much if anything
+and of course a filesystem is considerably more flexible to
+administer (backup, move, change size, check integrity, etc.)
+
+> - choose block sizes according to applications
+> - reducing fragmentation
+> - packing data in nearby cylinders
+
+... but when this storage area is spread over multiple mechanisms in a
+smart storage array with write caching, you've no idea what is where
+anyway. Better to let the hardware or at least the OS manage this;
+there are so many levels of caching between a database and the
+magnetic media that working hard to influence layout is almost
+certainly a waste of time.
+
+Kirk McKusick tells a lovely story that once upon a time it used to be
+sensible to check some registers on a particular disk controller to
+find out where the heads were when scheduling I/O. Needless to say,
+that is history now!
+
+There's a considerable cost in complexity and code in using "raw"
+storage too, and it's not a one off cost: as the technologies change,
+the "fast" way to do things will change and the code will have to be
+updated to match. Better to leave this to the OS vendor where
+possible, and take advantage of the tuning they do.
+
+> - Anyone other ideas -> the sky is the limit here
+
+> It also aids portability, at least on platforms that have an
+> equivalent of a raw device.
+
+I don't understand that claim. Not much is portable about raw
+devices, and they're typically not nearly as well documented as the
+filesystem interfaces.
+
+> It is also independent of the standard implemented Un*x filesystems,
+> for which you will have to pay extra if you want to take extra
+> measures against power loss.
+
+Rather, it is worse. With a Unix filesystem you get quite defined
+semantics about what is written when.
+
+> The problem with e.g. e2fs, is that it is not robust enough if a CPU
+> fails.
+
+ext2fs doesn't even claim to have Unix filesystem semantics.