This is the list of issues raised about the 9 Dec 2003 Last Call Working Draft of "Architecture of the World Wide Web". See also the annotated version of the specification, which includes issue text inline.
At its 4 Dec 2003 teleconference, the TAG resolved unanimously to advance "Architecture of the World Wide Web, First Edition" to Last Call status. Please send comments on the draft to the public mailing list public-webarch-comments@w3.org (archive).
For more information about the TAG, refer to the TAG Home Page.
Id:Title | State | Type | Open actions | Ack. |
---|---|---|---|---|
hammond1 : Distinguish normative/informative references? | declined | editorial | | No response to reviewer |
hammond2 : Editorial comments | no decision (raised) | editorial | | |
harold1 : Title of URI uniqueness constraint | agreed | editorial | | No reply from reviewer |
karr1 : What does "authority component" mean? | agreed | clarification | | No reply from reviewer |
ducharme1 : Editorial comments | agreed | editorial | | No response to reviewer |
worthington1 : Simplify the text and separate the W3C politics | declined | editorial | | No response to reviewer |
booth1 : Definition of "Web Agent" | declined | editorial | | Agreement |
booth2 : What rights does "URI ownership" confer? | agreed | clarification | Agreement | |
booth3 : 4.5.4: NS document as definitive source of info on namespace | agreed | clarification | Agreement | |
goodwin1 : Editorial suggestions | agreed | editorial | No response to reviewer | |
stickler1 : Editorial suggestions | no decision (raised) | editorial | ||
stickler2 : Sections 4.5.4, 5: Namespace document | declined | error | | No reply from reviewer |
stickler3 : Section 5: Dereference a URI | declined | editorial | | No response to reviewer |
stickler4 : Section 3.6.1 Proposed removal of good practice note | declined | request | | No reply from reviewer |
stickler5 : Section 3.6, para 1: Fix "resource is unreliable" | agreed | error | | No response to reviewer |
stickler6 : Section 3.5.1: POST requests and URIs | agreed | error | | No reply from reviewer |
stickler7 : Section 3.4, para 2: URI ownership questions | no decision (raised) | error | ||
stickler8 : Section 3.3.1, last para, last sentence: Nature of secondary resource not known through URI | no decision (raised) | clarification | | |
stickler9 : Good practice note on URIs without fragids? | subsumed | proposal | | |
stickler10 : Section 5: Secondary resource | no decision (raised) | editorial | | |
pps1 : Ownership and authority | agreed | error | No response to reviewer | |
hawke1 : Proposed good practice note on looking inside protocol interactions | subsumed | proposal | | |
hawke2 : Section 2: Full agreement not required for communication | subsumed | error | | |
hawke3 : 2.2 UUID/MD5 not registered URI schemes | no decision (raised) | editorial | | |
hawke4 : 2.3: Propose "URI overloading" not "URI ambiguity" | agreed | editorial | No response to reviewer | |
hawke6 : 2.6. Fragment Identifiers: "Stem resource" | no decision (raised) | editorial | ||
hawke7 : 2.7.2. Assertion that Two URIs Identify the Same Resource | subsumed | proposal | | |
hawke8 : 3.2. Messages and Representations: Use of "state" | no decision (raised) | editorial | ||
hawke9 : 3.2. Messages and Representations | no decision (raised) | editorial | | |
hawke10 : 3.5. Safe Interactions | declined | editorial | No response to reviewer | |
dhm1 : 1.1.3. Principles, constraints and good practices | no decision (raised) | editorial | | |
dhm2 : "Silent recovery from error is harmful" | no decision (raised) | editorial | | |
dhm3 : "Language extension" definition | agreed | editorial | | No reply from reviewer |
dhm4 : 1.2.4 Protocol based interoperability | no decision (raised) | editorial | | |
dhm5 : 4.1 Prevalence of Unicode | no decision (raised) | editorial | | |
dhm6 : Use of "server" in "...authoritative server metadata..." | no decision (raised) | clarification | | |
dhm7 : Webarch conformance model, subjects of GPNs | no decision (raised) | clarification | | |
weitzner1 : Proposed summary format | no decision (raised) | proposal | | |
rodriguez1 : Dataweb?: XDI and XRI. | declined | request | No reply from reviewer | |
kopecky1 : Proposed to drop stories | no decision (raised) | editorial | | |
kopecky2 : 3.1 Reference or Identify? | declined | clarification | | No response to reviewer |
kopecky3 : 4 application/xml | no decision (raised) | clarification | | |
kopecky4 : 4.5.3 use of "understand" | no decision (raised) | clarification | | |
kopecky5 : 4.5.5 More info on qnames, fragids, ns docs | subsumed | request | | |
kopecky6 : 4.5.6 What's the conclusion? | subsumed | request | ||
duerst1 : Principle and constraint titles | no decision (raised) | editorial | | |
duerst2 : WebArch and RFC2396bis: URIs and Fragids | no decision (raised) | editorial | | |
diwg1 : Add scenario(s) with dynamically generated URI | no decision (raised) | proposal | | |
diwg2 : Don't communicate language info in URIs (in example) | declined | error | | No reply from reviewer |
diwg3 : Suggest discussion of accessing different representations (transformed) of the same resource | no decision (raised) | proposal | | |
diwg4 : Suggest discussion of the limitations of Internet media types as the prime mechanism for selecting between different representations of a resource. | agreed | proposal | | No reply from reviewer |
clark1a : Fragment Identifier Semantics | agreed | clarification | | No reply from reviewer |
clark1b : Conflicting secondary resources | no decision (raised) | clarification | ||
clark2 : What kinds of ambiguity are there? | no decision (raised) | clarification | ||
clark3 : Willy-Nilly Resource Change | declined | clarification | | No reply from reviewer |
clark4a : Hypertext Good Practice Redundancies | declined | clarification | No reply from reviewer | |
clark4b : "Expected UI Paradigm"? | no decision (raised) | clarification | | |
clark5 : Silent Error Recovery Always Harmful? | agreed | error | | No reply from reviewer |
clark6 : Separating Presentation From Content | no decision (raised) | clarification | | |
clark7 : More Ambiguity | no decision (raised) | clarification | | |
clark8 : Section 3.4.'s Unmotivated Paragraph | no decision (raised) | clarification | | |
clark9 : "Safe" and "Unsafe" Interactions | no decision (raised) | editorial | ||
clark11 : The "great power" of URIs and their "vastness of choice" | no decision (raised) | editorial | | |
clark12 : Needless Propagation of URIs? | agreed | error | | No response to reviewer |
laskey1 : Editorial comments on WebArch | no decision (raised) | editorial | | |
laskey2 : What determines URI uniqueness? | no decision (raised) | clarification | | |
gilman1 : 'legal requirement' as justification for 'particular presentation' misses 'leading Web to highest' mark | agreed | error | | No response to reviewer |
gilman2 : orthogonality is not the answer | declined | error | | No reply from reviewer |
lesch1 : Editorial comments | no decision (raised) | editorial | | |
schema1 : [1.2.3] "Silent recovery from error is harmful." | subsumed | error | | |
schema2 : [Section 2] Unwise confluence of identification and retrievability | agreed | error | No response to reviewer | |
schema3 : [Section 2.3] Clarity required on nature of "resource" | agreed | error | No response to reviewer | |
schema4 : [3.3 Good practice: Fragment Identifier Consistency] | subsumed | clarification | | |
schema5 : [3.3.1] Inconsistency with RFC2396bis about frag id meaning? | agreed | error | No response to reviewer | |
schema6 : [3.3.1] Do fragment identifiers work only with media-typed representations? | no decision (raised) | clarification | ||
schema7 : [3.4.1] What is scope of metadata? | no decision (raised) | clarification | ||
schema8 : [3.4.1] Authority and trust | subsumed | error | ||
schema9 : [3.4.1] Are peer-to-peer interactions covered? | subsumed | error | ||
schema10 : [3.5] Breadth of "safe" interactions | no decision (raised) | error | ||
schema11 : [3.5.1] Best practice that content-location SHOULD be used? | no decision (raised) | clarification | ||
schema12 : [3.6.1] Good practice: Available representation. Too preferential to dereferencable URIs | agreed | error | No response to reviewer | |
schema13 : [4.2] Overly simplifies a complex problem | agreed | error | | No reply from reviewer |
schema14 : [4.2.3] Must * rules in instance v. documentation | no decision (raised) | clarification | | |
schema15 : [4.2.4] SOAP message cannot include JPEG | declined | error | Agreement | |
schema16 : [4.5.1] Section on when to use XML formats underdeveloped | agreed | editorial | No response to reviewer | |
schema17 : [4.5.3] Statement about XMLNS and unique names false | agreed | error | | No reply from reviewer |
schema18 : [4.5.3] Clarification on "type" in XML Schema | agreed | clarification | | Agreement |
schema19 : [4.5.3] Element type/instance confusion | agreed | clarification | | Agreement |
schema20 : [4.5.6] Flavors of ID not discussed | agreed | error | | No reply from reviewer |
schema21 : General editorial comments | no decision (raised) | editorial | | |
lafon1 : Implications of HTTP URI implying GET as default method | no decision (raised) | editorial | ||
msm1 : editorial comments on Web Architecture document | no decision (raised) | editorial | ||
msm2 : WD-webarch-20031209, 1.1.3: Self-descriptive markup considered improbable | declined | error | No response to reviewer | |
msm3 : WD-webarch-20031209, 1.2.1 para 1: Assigning identifiers without knowing about representations | subsumed | error | | |
msm4 : WD-webarch-20031209, 1.2.1, final bulleted list, final item.: Authoritative metadata and the principle of decentralization | agreed | error | No response to reviewer | |
msm5 : WD-webarch-20031209, 1.2.2 para 5: Extensibility is not a property of languages in isolation | no decision (raised) | error | ||
msm6 : WD-webarch-20031209, 1.2.2 para 5: Ignoring the unknown as a default action | no decision (raised) | error | ||
msm7 : WD-webarch-20031209, 1.2.2 para 6: Ignoring elements and ignoring tags | agreed | editorial | No response to reviewer | |
msm8 : WD-webarch-20031209, Section 2, introductory paragraphs: The term 'resource' needs to be defined | no decision (raised) | error | ||
msm9 : WD-webarch-20031209, Section 2 para 3: The vastness of URI space | subsumed | error | | |
msm10 : WD-webarch-20031209, Section 2: Assigning URIs to resources others will expect to refer to | subsumed | error | | |
msm11 : WD-webarch-20031209, Section 2.2, bulleted list, first item: Delegation of authority in hierarchical URIs | subsumed | error | ||
msm12 : WD-webarch-20031209, Section 3.3.1, para 1: Are there constraints on the interpretation of fragment identifiers? | no decision (raised) | error | ||
msm13 : WD-webarch-20031209, Section 3.3.2, para 3: Consistency of fragment identifiers | agreed | error | | No reply from reviewer |
msm14 : WD-webarch-20031209, Section 4.2.2, Story: Allowing extra attributes does change the conformance of existing data | declined | error | No reply from reviewer | |
baker1 : Independence between identifier and resource, or representations? | subsumed | clarification | ||
baker2 : More info on non-browser Web | subsumed | request | ||
baker3 : Editorial comments | no decision (raised) | editorial | ||
baker4 : 4.5.2: Preference for RDF linking over XLink linking | no decision (raised) | error | ||
parsia1 : LC Comments: 1.2.1, editorial | no decision (raised) | editorial | ||
parsia2 : LC Comment 1.3.1, editorial | no decision (raised) | editorial | ||
parsia3 : LC Comment, 1.2.3: Principle: Error recovery | no decision (raised) | clarification | ||
parsia4 : LC Comments, 1.2.4, editorial | no decision (raised) | editorial | ||
parsia5 : LC Comment, Section 2: Agreement on identifiers | subsumed | error | ||
parsia6 : LC Comment, Section 2: Identification mechanism of the Web | subsumed | error | ||
parsia7 : LC Comment, Section 2: On requirement to assign a URI to a resource | subsumed | error | | |
parsia8 : LC Comment, Section 2: On resources existing before URIs | no decision (raised) | clarification | ||
parsia9 : LC Comment, Section 2: On resources being able to have zero URIs | subsumed | error | ||
parsia10 : LC Comment, Section 2: On URI assignment | subsumed | error | ||
parsia11 : URI assignment v. use. Who are URI producers? | agreed | editorial | No response to reviewer | |
parsia12 : Ambiguous use of URIs v. URI Ambiguity? | subsumed | error | ||
parsia13 : Use of term "URI Space" | no decision (raised) | clarification | ||
parsia14 : Various types of ownership | no decision (raised) | clarification | ||
parsia15 : Social implications of URI ownership. | no decision (raised) | error | ||
parsia16 : No conformance section? Guidance on usage then? | no decision (raised) | clarification | ||
parsia17 : Do you mean resource or representation? | no decision (raised) | clarification | ||
parsia18 : Temporal URL ambiguity useful for Web robustness? | no decision (raised) | clarification | ||
parsia19 : Ok to infer properties of retrieved representations? | no decision (raised) | clarification | ||
parsia20 : Drop definition of "on the Web" | subsumed | proposal | ||
parsia21 : Drop sentence on successful communication | subsumed | proposal | ||
parsia22 : What does "in general" mean? Would the case be different "in specific"? | no decision (raised) | clarification | ||
nottingham1 : Second bullet doesn't make sense. | agreed | error | No response to reviewer | |
nottingham2 : Include reference to IANA Registry of HTTP headers? | no decision (raised) | editorial | ||
klyne1 : Proposed to drop para on view source or clarify role in webarch | no decision (raised) | clarification | ||
klyne2 : Change "other operations" to "refer to in another way" | no decision (raised) | editorial | ||
klyne3 : Proposal to improve text about "network effect" | no decision (raised) | editorial | ||
klyne4 : Proposed rewrite of overlapping paras | no decision (raised) | editorial | ||
klyne5 : Proposed to use "global across the Web" rather than "global" | no decision (raised) | editorial | ||
klyne6 : Clarification about point on agents detecting equivalence relationships | no decision (raised) | clarification | ||
klyne7 : Use other schema than mailto as example | agreed | editorial | No response to reviewer | |
klyne8 : Unclear point about ambiguity in natural language; is the point about machine processing? | no decision (raised) | clarification | ||
klyne9 : Add stronger language on not permitting unregistered URI schemes. | agreed | editorial | No response to reviewer | |
klyne10 : Add cross-ref to section on orthogonality | no decision (raised) | editorial | ||
klyne11 : Change "will result" to "will necessarily result" | agreed | clarification | | No response to reviewer |
klyne12 : Proposal to drop paragraph on inconsistent frag ids | subsumed | error | | |
klyne12b : Drop "by design" or replace with "by intent" | no decision (raised) | editorial | ||
klyne13 : Text on communication between two parties misses mark about global names | no decision (raised) | clarification | ||
klyne14 : Managers of resource, not Oaxaca | no decision (raised) | editorial | ||
klyne15 : Lack of separation between owner of a resource and authority for a part of URI space used to identify a resource? | agreed | error | | No reply from reviewer |
klyne15b : Propose "rationally" instead of "predictably" | no decision (raised) | editorial | ||
klyne16 : Proposed improved example about using content negotiation | no decision (raised) | editorial | ||
klyne17 : Worth pointing out value of RDF descriptions depends on URI persistence? | agreed | proposal | No response to reviewer | |
klyne18 : Use one of "data format", "format". And "language"? | no decision (raised) | editorial | ||
klyne19 : Unclear statement about mixing RDF vocabularies | agreed | clarification | | No reply from reviewer |
klyne20 : Say something about relationship between Hypertext Web and Semantic Web? | subsumed | proposal | | |
klyne21 : Add statement about scalability concerns | agreed | proposal | | No response to reviewer |
klyne22 : Clarify what is meant by context having influence on use of hyperlinks | no decision (raised) | clarification | ||
klyne23 : Clarify section (see TBL text on identifier v. reference?) | no decision (raised) | clarification | ||
klyne24 : Is Web apart from Internet? | no decision (raised) | clarification | ||
klyne25 : Add reference to RFC3117, section 5.1? | agreed | proposal | No response to reviewer | |
klyne26 : Transcoding allowing by some or all intermediaries? | no decision (raised) | clarification | ||
klyne27 : Clarify para on "text/" and US-ASCII encoding. How does it relate to following GPN? | no decision (raised) | editorial | ||
manola1 : Use of "agent" as people+software is flaky throughout document. | no decision (raised) | editorial | ||
manola2 : Clarify nature of resource in example | no decision (raised) | editorial | ||
manola3 : Minor editorial comments | no decision (raised) | editorial | ||
manola4 : Sentence about refs ends abruptly | no decision (raised) | editorial | ||
manola5 : Sentence on understanding REST model unclear | no decision (raised) | clarification | ||
manola6 : User agent any kind of agent or just software? What does "on behalf of" include? | no decision (raised) | clarification | ||
manola7 : "Agent" or "user agent" meant? | no decision (raised) | clarification | ||
manola8 : Add "(names for things)" after "identifiers"? | no decision (raised) | clarification | ||
manola9 : Why use "third party"? Who are other two? | no decision (raised) | editorial | ||
manola10 : Who are "designers"? Any diff between URI owner/producer? Relationship to resource owner? | no decision (raised) | editorial | ||
manola11 : Proposed improvement to GPN on URI assignment | no decision (raised) | editorial | ||
manola12 : URI producers or owners? Relationship to opacity principle? Evidence of confusion about "agent" including "people"? | no decision (raised) | clarification | ||
manola13 : Can agents assign URIs? Or should this be "use"? | no decision (raised) | clarification | ||
manola14 : Clarify relationship between resource / URI ownership | no decision (raised) | clarification | ||
manola15 : Does example *also* illustrate ambiguous URI usage? | no decision (raised) | clarification | ||
manola16 : Paragraph on other uses of URIs is confusing | no decision (raised) | clarification | ||
manola17 : "Agent" that includes "people" source of confusion | subsumed | error | ||
manola18 : Update references to RDF, OWL specs | no decision (raised) | editorial | ||
manola19 : Please provide qualifying context about the nature of the Web | agreed | clarification | | No reply from reviewer |
manola20 : Have text after a story answer the question in the story. | no decision (raised) | editorial | ||
manola21 : Owner of resource v. owner of URI | no decision (raised) | clarification | ||
manola22 : "Agent" or "user agent" meant? | no decision (raised) | clarification | ||
manola23 : Can software agents incur obligations? ("agent" or "user agent") | no decision (raised) | clarification | ||
manola24 : What meaning(s) of "order" is meant? | no decision (raised) | clarification | ||
manola25 : Agents "do not" or "should not" incur obligations? | no decision (raised) | clarification | ||
manola26 : What does last sentence of story have to do with story? | no decision (raised) | editorial | ||
manola27 : Provide examples of mistaken attempts to restrict URI usage | no decision (raised) | proposal | ||
manola28 : Another case of "agent includes people?" doubt | no decision (raised) | editorial | ||
manola29 : What are "language instances"? | agreed | clarification | | No reply from reviewer |
manola30 : Difference between "setting expectations" and "specifying"? | agreed | clarification | | No reply from reviewer |
manola31 : Questions about RDF, text, XML mixing | agreed | clarification | No reply from reviewer | |
manola32 : Reword to avoid rhetorical question | no decision (raised) | editorial | ||
qawg1 : Seeking liaison on definitions of extensibility w.r.t. QA Guidelines | no decision (raised) | clarification | ||
i18nwg1 : Use "language" for natural language and "format" for format | no decision (raised) | editorial | ||
i18nwg2 : Use "language" for natural language and "format" for format | no decision (raised) | editorial | ||
i18nwg2b : Oaxaca hard to pronounce; propose Lima | no decision (raised) | editorial | ||
i18nwg3 : Please show charset in Content-type. | no decision (raised) | editorial | ||
i18nwg4 : Please refer to "issues" rather than "limitations" | no decision (raised) | clarification | ||
i18nwg5 : Discussion of content-type header hint | no decision (raised) | error | ||
i18nwg6 : Say something about character encoding/labeling errors. | no decision (raised) | editorial | ||
i18nwg7 : Mention language negotiation | no decision (raised) | editorial | ||
i18nwg8 : Sentences seem contradictory | subsumed | error | | |
i18nwg9 : Case example unclear. | no decision (raised) | clarification | ||
i18nwg10 : Don't recommend organizing information by language. | no decision (raised) | clarification | | |
i18nwg11 : Mention IRIs? | no decision (raised) | clarification | ||
i18nwg12 : Clarification on reference to "character" | no decision (raised) | editorial | ||
i18nwg13 : URI ambiguity ambiguous | no decision (raised) | editorial | ||
i18nwg14 : Show examples of good and bad ambiguity | no decision (raised) | clarification | ||
i18nwg15 : Missing word | no decision (raised) | editorial | ||
i18nwg16 : Good practice on URI opacity impossible to follow for humans. | subsumed | error | ||
i18nwg17 : Add mention of IRIs | no decision (raised) | editorial | ||
i18nwg18 : Mention that editing tools may be more strict than simple user agents | no decision (raised) | editorial | ||
i18nwg19 : text/foo+xml considered useless? | subsumed | proposal | ||
i18nwg20 : text/foo+xml considered useless? | no decision (raised) | proposal | ||
rdfcore1 : RDF Core general comments | no decision (raised) | editorial | ||
dubinko1 : Document different interpretations around httpRange-14 | no decision (raised) | editorial | ||
dubinko2 : Architecture v. Building Codes | no decision (raised) | editorial | ||
falstrom1 : Editorial comments | no decision (raised) | editorial | ||
falstrom2 : SOAP as a different thing than the other protocols | no decision (raised) | clarification | ||
falstrom3 : Separate Media Types and Frag Id discussion | no decision (raised) | editorial | ||
falstrom4 : Show only good practice re: Content-Location | no decision (raised) | editorial | ||
falstrom5 : Add discussion about SSL/TLS? Is title correct? | no decision (raised) | editorial | ||
falstrom6 : List overall diffs between binary and text formats | no decision (raised) | editorial | ||
falstrom7 : Indicate when to use different link types | no decision (raised) | editorial | ||
rosenberg1 : Introductory para on Web overly broad? | no decision (raised) | editorial | ||
rosenberg2 : RFC2119 terms meant for protocols | no decision (raised) | editorial | ||
rosenberg3 : Reuse appropriate URI schemes (and protocols) | subsumed | proposal | ||
rosenberg4 : Use SIP for voice-over-ip, RTSP for streaming media | no decision (raised) | clarification | ||
rosenberg5 : Proposed reference to IANA registry for namespaces and RFC 3688 | agreed | editorial | No response to reviewer |
Would it be appropriate to distinguish between normative and non-normative (i.e. informative) references?
The TAG does not feel an informative/normative split is justified.
Send TAG a draft of a response to Hammond review in light of TAG's discussion.
The reviewer made a number of editorial suggestions.
Propose editorial response.
Implemented:
Did not implement: "Sect 4 is entitled 'Data Formats', and Sect 1, 3rd bullet has 'Formats'. Would suggest that both should be changed to 'Representation' in keeping with the 3 bases articulated in the Abstract (identification, interaction, representation). This shift in gears from representation to data formats is potentially confusing. Maybe within the section one could talk of data formats (as a more concrete realization of the abstraction 'representation'), but I think the section (and bullet) are better labeled at the more generic/abstract level." Rationale: We used to have that and then chose the current organization instead.
Did not implement: "Almost all the Story examples seem to make use of HTTP URIs. Any chance of sneaking in some other URI schemes just here and there just to reinforce that the fact that this is a democracy not a monarchy? Perhaps even just a mailto, or urn, or something more exotic?" Rationale: We have examples of other schemes. No need to use exotic schemes if not motivated by story.
Did not implement: "Sect 2.4, 3rd para, 1st sentence, 'While the Web architecture...' - change 'is costly' to 'can be costly'?". Rationale: Not sure about this change.
Did not implement: "Sect 2.4, 3rd para, 3rd sentence, 'Introducing a new URI scheme...' - change 'requires' to 'may require'?" Rationale: Not sure about this change.
Did not implement: "Sect 2.4, last para, last sentence - 'When an agent does not handle a new URI scheme, it cannot retrieve a representation.' This seems prejudicial, as if the only interesting operations are retrieval. An agent can already make use of the identity afforded by a URI and comparison of URIs in applications such as merging of RDF graphs or of merging Topic Maps which identify resources by means of URIs." Rationale: Nonetheless, the statement is true.
Did not implement: "Sect 3, last para ('Note') before Sect 3.1. I would strongly query the sentence 'Informally, a resource is "on the Web" when it has a URI and an agent can use the URI to retrieve a representation of it ...'. I would rather say that a resource is "on the Web" when it is referenced by means of a URI. That would seem to me to be a full and sufficient condition. A resource referenced by a URI participates within the Web information space and assertions can be made about that resource." Rationale: The TAG did not agree to that definition.
Did not implement: "Sect 3.6.2, 1st para. Should clarify here that 'URI persistence' actually refers to persistence of the referenced resource, not to the URI. (That point is made in the [Cool] reference entry but should be made here and not in the reference section.)" Rationale: Having reread the sentence, I don't believe that's necessary. It's defined clearly.
Did not implement: "Sect 4.5.5, 1st para, 2nd sentence. 'A qualified name is a pair consisting of a URI,..., and a local name...' Surely the qualified name itself consists of a 'prefix' which represents the URI (i.e. is a URI placeholder), and a local name?" Rationale: I think that's a qname rather than a qualified name.
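The distinction drawn in the last rationale, between a qname (prefix plus local name, as written in the markup) and a qualified name (namespace URI plus local name), can be illustrated with a minimal sketch. It assumes Python's standard `xml.etree.ElementTree` parser; the namespace URI `http://example.org/weather`, the element name `forecast`, and the helper `expanded_name` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def expanded_name(xml_text):
    """Return (namespace URI, local name) for the root element of xml_text."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree reports namespaced names as "{namespace-URI}local-name".
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # element is in no namespace


# Two documents that use different prefixes (different qnames)...
doc_a = '<w:forecast xmlns:w="http://example.org/weather">sunny</w:forecast>'
doc_b = '<wx:forecast xmlns:wx="http://example.org/weather">sunny</wx:forecast>'

# ...bind those prefixes to the same URI, so the (URI, local name) pairs match.
print(expanded_name(doc_a) == expanded_name(doc_b))
```

The prefix is only a placeholder inside one document; the comparison above depends solely on the URI and the local name.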
"Constraint: URI uniqueness" is defined:
Web architecture does not constrain a Web resource to be identified by a single URI.
The constraint and the title do not seem to match. Perhaps the title is supposed to be "URI non-uniqueness" or perhaps the text is supposed to be something like "Each URI identifies exactly one resource". However, the title suggests to me that URIs are unique, and the text suggests the opposite.
At its 12 May 2004 teleconference, the TAG resolved to demote the first constraint of section 2.1 to a sentence.
Propose editorial response.
In section 2.1, changed title of constraint to URI multiplicity.
Incorporate TAG resolution.
Deleted the "URI multiplicity" constraint from the beginning of 2.1. This text was moved (as an ordinary sentence) to the new subsection: URI/Resource Relationships.
In section 2.1, "URI Comparisons", I understand the meaning of the paragraph which begins "Applications may apply rules ...". It means that if your application makes assumptions about URI equivalences based on details not covered in the specification, then it's your responsibility if any problems develop from that. What I don't understand is the term "authority component" in this sentence:
For example, for "http" URIs, the authority component is case-insensitive.
The TAG agreed with the Editor's change to include parenthetical.
Propose editorial response.
In section 2.1, included a parenthetical explanation of what the generic "authority component" is.
The TAG agreed with the Editor's change to include parenthetical.
Ask reviewer if satisfied.
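As a rough illustration of the sentence the reviewer quoted, the sketch below compares two "http" URIs while treating the host part of the authority component as case-insensitive. It assumes Python's standard `urllib.parse.urlsplit`; the function name `same_http_uri` and the example.org URIs are hypothetical. The sketch takes the conservative reading that only the scheme and host are case-insensitive: any userinfo is compared exactly, and default ports are not normalized.

```python
from urllib.parse import urlsplit


def same_http_uri(a, b):
    """Compare two URIs, ignoring case only in the scheme and host."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (
        pa.scheme.lower() == pb.scheme.lower()
        # .hostname is already lowercased by urlsplit; .netloc is not.
        and pa.hostname == pb.hostname
        and pa.port == pb.port
        and (pa.username, pa.password) == (pb.username, pb.password)
        # Everything after the authority is compared character by character.
        and pa.path == pb.path
        and pa.query == pb.query
        and pa.fragment == pb.fragment
    )


# Differs only in the case of the authority component.
print(same_http_uri("http://WWW.Example.ORG/weather", "http://www.example.org/weather"))
# Differs in the case of the path.
print(same_http_uri("http://www.example.org/Weather", "http://www.example.org/weather"))
```

An application that goes further than this, for example by treating paths as case-insensitive, is making the kind of assumption the paragraph in section 2.1 warns about.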
The reviewer made a number of editorial suggestions.
The Editor will take these comments into account.
Propose editorial response.
Incorporated all of reviewer's suggestions.
Split document in two: architecture, rationale
The TAG believes the document has an appropriate quantity/level of examples. However, the reviewer has said: "Thanks, I am satisfied that you have given my comment serious consideration. But as is I don't see the document as being workable, so I will not be recommending use of the Architecture to my students, colleagues or clients."
Respond to Tom Worthington, talking about arch doc / findings balance, and pointing out that we are not creating a point-form architectural thesis.
Should not include people
The TAG defends its definition of "agent" as including people.
Does not object to the definition
Respond to DB on TAG's choice of agent - the status quo.
Following the lessons of the "deep linking" debacle, it might be good to say explicitly what rights "URI ownership" does or does not confer. This is somewhat addressed later, but it might be good to say something in this section.
The Editor will include a forward link from 2.2 to 3.6.3.
Satisfied with editorial change in 10 May 2004 draft.
Include forward reference.
In section 2.2, included a forward reference to section on URI ownership.
Incorporated in 10 May 2004 draft.
However, the term "definitive" is missing. Was this intentional? Based on a quick skimming of the issue, it looks like the TAG is in agreement that the namespace document should directly or indirectly provide *definitive* material about the namespace, but I'm not sure.
The TAG agrees but for consistency prefers the term "authoritative".
Satisfied with editorial change in 10 May 2004 draft.
Refer to provider's namespace information as "authoritative".
In section 4.5.4, edited good practice note to say that: "When a namespace representation is provided by the namespace URI owner, that material is authoritative."
Also, globally changed the term "resource owner" to "URI owner" and clarified usage of "authority responsible for domain X" in the document.
The TAG accepted the proposal.
The reviewer made a number of useful editorial suggestions.
The Editor will take these comments into account.
Account for these comments.
Took into account all of the reviewer's suggestions. In particular, reorganized section 2.1 to read more clearly. Created a new subsection (2.1.1) out of the second half.
Incorporated in 10 May 2004 draft.
The reviewer made a number of useful editorial suggestions. This issue addresses those editorial points in the initial email not covered by other issues.
Incorporated editorial suggestions.
Incorporated in 10 May 2004 draft.
.. It is incorrect to suggest that there is any semantic relation between the meaning of a URI used as a namespace name and the meaning of terms grounded in that namespace...
Strongly advise the removal of both this term from the publication entirely but particularly this incorrect definition (see discussion above). The assertion that every URI used as a namespace name denotes a namespace document is false.
Adopt the following definition as a substitute for namespace document: "If a namespace declaration binds a prefix to a URI, and that URI can be dereferenced to get a representation, then that is a namespace representation."
Inform reviewer of TAG's decision.
Incorporated editorial suggestions.
Substitute "namespace representation" for "namespace document". The TAG discussed this edit briefly at their 12 May 2004 ftf meeting. There was no resolution, but the editor suspects that the change will be undone in subsequent drafts along with other changes regarding "information resources".
New proposal
First two paras edited so description of XML Namespaces has changed.
Consider expanding to "Indirectly access the resource identified by the URI via a representation of that resource."
The definition makes sense in the full context of the document; the TAG recommended no change to the glossary.
Consider change
The Editor considered and did not adopt the proposed change for the definition of "dereference a URI": "Indirectly access the resource identified by the URI via a representation of that resource."
The TAG agreed with the editor's choice not to change the glossary entry.
Ask the commenter if the definition in context (in section 3.1) explains the way we use the terms to his satisfaction.
[Empty]
Owners of URIs should be free to decide whether any representations are made available, and should *NOT* feel obligated to provide representations if they themselves have no need to do so. URIs without representations may simply be less valuable/useful than those with representations. But it shouldn't be considered bad practice to not provide any representations. I recommend that this particular "good practice" be removed, even though language should remain which reflects that URIs with accessible representations are usually more useful than those without.
The TAG intended to indicate that people SHOULD provide representations; the community is poorer where representations are not available. "SHOULD" allows URI owners to make a choice.
Inform reviewer of TAG's decision.
[Empty]
How can "...they both conclude that the resource is unreliable" since (a) they cannot determine from either the URI or any representation what resource the URI actually denotes, and (b) the behavior of a given server providing access to representations of a resource is all that can be unreliable. The resource itself is (typically) not part of the system. A better example of "unreliability" might be a service which frequently returns 404 responses rather than useful representations or one which often returns representations which do not accurately reflect the state of the weather in Oaxaca, or one which sometimes returns XHTML but other times returns plain text. Yet in such cases, it is the service resolving the URI to representations that is unreliable or inconsistent, not the resource itself.
The Editor will s/unreliable/unpredictable.
Inform reviewer of TAG's decision.
Changed "unreliable" to "unpredictable" in 3.6 story. At their 13 May 2004 ftf meeting, the TAG discussed the use of both terms (unreliable and unpredictable) but did not come to a clear (revised) resolution.
Updated proposal
404 example included; both "reliable" and "unpredictable" used.
Para 3 seems to contradict the last statement of para 1. In para 1 it is said that POST requests and responses cannot be referenced by URIs, yet para 3 describes a means to do just that. It seems that what is meant to be said in para 1 is that, per the default behavior of POST, the request and response are not normally assigned distinct URIs by which they can be later referenced. ???
The Editor will review 3.5.1 and propose a revision to the TAG that more clearly distinguishes the two topics of bookmarking results of POST and paper trails (both safe and unsafe contexts).
Inform reviewer of TAG's decision.
Created subsection 3.5.1 to distinguish topics of safe interaction and paper trail.
The TAG asked the Editor to edit this section to say that:
The reviewer raised a number of points about URI ownership and authority in sections 3.4 para 1 and para 2.
Propose editorial response.
Edited first and second paragraphs of section 3.4 in a way which I believe is consistent with the reviewer's comments.
Incorporated in 10 May 2004 draft.
This sentence seems misleading, as if one can infer something about the nature of a secondary resource by interpreting a URI reference with fragment identifier. One cannot infer the nature of any URI denoted resource based either on the URI *or* based on any representation obtained by dereferencing that URI, either directly, or for URI references with fragment identifiers, by first dereferencing the base URI and interpreting the fragment in terms of the MIME type of the returned representation. This last sentence could either be removed or clarified/reworked.
Propose to the TAG a response to P. Stickler's message.
[Empty]
Question: are the methods PUT, POST or DELETE meaningful for URI references with fragment identifiers, in terms of interacting with the state of the secondary resources denoted? If not, then it seems there is a good principle that one should use URIs without fragment identifiers whenever possible to maximise the utility of those URIs.
Overtaken by events.
Respond to reviewer's comment that HTTP PUT/POST/DELETE do not work with URIs with fragment identifiers since HTTP does not give access to the secondary resource.
In section 3.3.1, included "Note also that since dereferencing a URI (e.g., using HTTP) does not involve sending a fragment identifier to a server or other agent, certain access methods (e.g., HTTP PUT, POST, and DELETE) cannot be used to interact with secondary resources."
At their 13 May 2004 ftf, the TAG rejected the proposed text and asked the Editor to remove the sentence from the document.
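Independent of the TAG's decision on the sentence, its first clause (that dereferencing over HTTP does not send the fragment identifier to the server) can be illustrated with a minimal sketch. The helper name `http_request_target` is hypothetical, and the URIs reuse the weather.example.com examples from this issues list; the sketch assumes an origin-form request target (path plus optional query).

```python
from urllib.parse import urlsplit

def http_request_target(uri: str) -> str:
    """Return the path-and-query an HTTP client would put on the request line.

    Hypothetical helper for illustration: the fragment component is
    dropped because it is interpreted by the client, not the server.
    """
    parts = urlsplit(uri)
    target = parts.path or "/"          # empty path is sent as "/"
    if parts.query:
        target += "?" + parts.query     # the query IS sent to the server
    return target                       # parts.fragment is never included

print(http_request_target("http://weather.example.com/oaxaca#ws17a"))
print(http_request_target("http://weather.example.com/rdfdump?region=oaxaca&station=ws17a#summary"))
```

In both calls the text after "#" stays on the client side; only the path and query reach the server.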
This definition is difficult to read and seems to be grammatically ill formed. It should be reworked a bit. Perhaps "A resource that is indicated by a fragment identifier component of a URI reference, which must be interpreted in terms of a representation obtained by dereferencing the base URI of the URI reference along with the media type of that representation". ???
Fix grammatical error in definition of "secondary resource" in section 5.
Fixed grammatical error.
Given all these problems I don't see how the architectural principles of the World Wide Web can be so dependent on resource ownership. Many of the uses of ``resource owner'' in the document do not make sense at all and need to be removed from the document.
The term "Resource owner" has been replaced with "URI owner".
Propose editorial response.
Globally replaced "resource owner" with "URI owner".
Incorporated in 10 May 2004 draft.
Good practice: user agents should allow user to look "inside" to see (and even manipulate) the protocol interactions the agent is performing on behalf of the user.
Overtaken by events.
Propose text for this issue. There is general support for a visibility principle.
Parties who wish to communicate must agree upon a shared set of identifiers and on their meanings.
This is untrue for some reasonable meanings of "meaning", as Pat Hayes has argued from time to time. You could say instead: "Parties who wish to communicate must agree on the practical effects of using certain identifiers." or "Parties who wish to communicate must agree upon a shared set of identifiers and (to a reasonable degree) on their meanings."
That is: some ambiguity of meaning is both reasonable and unavoidable. I don't think an unqualified "agree" normally means "partially agree".
Does http://weather.example.com/oaxaca identify the weather report for just Oaxaca or for the Oaxaca region? When it starts to matter, you can start to build a shared understanding of which it is. But you can't banish those ambiguities until you notice them. There's also a school of design where you choose not to banish them, even when you see them, until you know they matter.
Overtaken by events.
Propose editorial response.
Agreed with reviewer; fixed text at beginning of section 2.
They are not on IANA's list. I pay some attention, and I'm not aware of a stable specification for either one. The spec on DanC's list for UUID has long since expired; the reference for MD5 is simply to a hypothetical use of it.
Propose editorial response.
Agreed with reviewer; fixed text in bulleted list of section 2.3.
Incorporated in 10 May 2004 draft.
Remove the middle bullet from 2.3.
No more large number discussion. Left in mid/cid as examples where there is hierarchical delegation. Note the rationale for establishing uri/social entity relationship: "It is useful for a URI scheme to...". Not sure if that is sufficient....
Well then use a different word! Please! Call it "URI Overloading" and let "URI Ambiguity" be used for the unavoidable and quite acceptable situation I talked about in my Comment 8.
The TAG accepted the global change of "URI ambiguity" to "URI Overloading".
Propose editorial response.
Agreed with reviewer; changed "ambiguity" to "overloading" globally. Also, in section 2.2, replaced Moby Dick example with movie example. Also changed the end of the paragraph in section 2.2.1 similarly.
Incorporated in 10 May 2004 draft.
I would suggest instead [of "secondary resource"] that you: (1) Name the portion of the URI up to the "#". TimBL has called this the "racine", but I like "stem", "trunk", or maybe even "non-fragment portion". (2) Call the resource identified by a URI's stem the "stem resource", or something like that.
Proposal: Knowing two URIs identify the same resource does not, however, mean they are interchangeable. For example, Oaxaca might have several government-run weather stations, and the measurements taken from each of these might be available from both weather.example.org and weather.example.com. The first might call a particular station http://weather.example.org/stations/oaxaca#ws17a while the second calls it http://weather.example.com/rdfdump?region=oaxaca&station=ws17a. These two URIs would both identify the same resource, a certain collection of weather measuring equipment. They are owl:sameAs each other. But an attempt to dereference them might well produce different content produced by different organizations (probably based originally on the same government-supplied data), so a user agent which substituted one for the other would be serving its user poorly.
Overtaken by events.
Propose editorial response.
Adopted reviewer's suggested text in section 2.7.2 (third para). However, after 7 June 2004 teleconf, deleted the proposed paragraph.
Does "state" really mean anything there? Is there a difference between "data about Ian" and "data about the state of Ian"? Maybe this could be clarified with: Note: the phrases "representation of the state of the resource" and "representation of the resource" mean essentially the same thing; the term "state" is sometimes used to help convey that resources and thus their representations often vary over time.
Propose editorial response.
Adopted reviewer's suggestion in some cases to eliminate redundancy, but left "resource state" when it was useful and not verbose.
Incorporated in 10 May 2004 draft.
s/electronic data/data
Propose editorial response.
Adopted reviewer's suggestion to change "electronic data" to "data" in section 3.2.
Choice of word "safe" in "safe operation" and relation to "dangerous"
The TAG agreed with the Editor's choice to retain the words "safe" and "unsafe".
Propose editorial response.
In section 3.5, added explanatory note at end of second paragraph. However, kept terms "safe" and "unsafe" since they are used in the RFC.
Incorporated in 10 May 2004 draft.
The document defines design choices and properties but never uses them. Are they needed at all, then? The document has 2 constraints, but it's not clear they fit in the given definition. (See related comments from Martin Dürst about the terms principle/constraint/etc.)
Propose editorial response.
Eliminated unused concepts and reduced section 1.1.3. Reviewer satisfied.
The formulation of the principle loses most of the context; what about "User agents SHOULD NOT silently recover from errors"
Propose editorial response.
Did not change Error recovery GPN in section 1.2.3 per reviewer's suggestion, but text changed based on other comments.
The TAG believes that distinguishing "error correction" (errors that can be corrected as though they never happened) from "error recovery" (situations where the agent cannot correct the error) will improve the text. IJ to incorporate that change.
The language extension definition seems awkward in comparison with the usual terminology; an extension to XSLT or SOAP or most languages I can think of is used to designate the additional part of the language, rather than the superset of the language including the basic language + the additional part. That is, if A is the basic language, B, the additional part, extensions usually refer to B rather than A+B.
The TAG agrees to change the definitions so where "Extended language" is used to mean "A+B", "Extension" is used to mean "B".
Incorporate change into text.
Extended language: A+B. Extension: B
The message of the section is not clear; "important interfaces are defined in terms of protocols"... rather than?
Add a second paragraph to 1.2.4 explaining what TBL said about resilience as typical design goal in protocols.
Modified first paragraph to talk more about large-scale protocols v. traditional software APIs.
"In modern textual data formats, the characters are usually taken from the Unicode repertoire" Is that really the state of affairs, or rather what modern textual formats should do? It seems that the adjective "modern" is used to suggest that it's what textual data formats should do to be modern, rather than to describe the current reality.
Propose editorial response.
In section 4.1, changed text in second para to "Increasingly, internationalized textual data formats refer to the Unicode repertoire [UNICODE] for character definitions."
Should this be reworded in "User agents MUST NOT silently ignore authoritative metadata."? If so, is it still worth mentioning? (ie, is there any point in saying "do what the protocol says to do"?)
Propose editorial response.
Adopted reviewer suggestion in principle of section 3.4.1 (consistent with other changes regarding authoritative metadata). Reviewer satisfied.
The TAG asked the Editor to compress sections 3.4 and 3.4.1 into a single section 3.4. The title of the section will be "Message semantics". Reviewer satisfied.
Discussion of "authoritative metadata" removed in favor of limiting discussion to that about data/metadata inconsistencies. Reviewer's comment about protocol semantics may be captured by first para.
(The reviewer analyzes the various subjects of the GPNs and raises some questions.) "In any case, my general comment is that it would be better to reduce the list of conformance subjects in the arch document: - to avoid some of the fuzziness introduced by having non-defined conformance subjects - to make it easier for the reader to understand the requirements."
Propose editorial response.
Attempted to follow reviewer's suggestions. Eliminated "resource owner" in favor of "URI owner". Similarly, changed "user agent" to "agent" in GPNs. Changed "author" to "content author", "server manager" to "representation provider", and "developer" to "software developer". In the GPNs, changed "language designer" and "format designer" to "Specification" (as the subject). Moved note about format v. language to section 1.1.1 and introduced phrase "specification designer" as an encompassing term. Reviewer satisfied.
I would take the principle and put it together with the nice 1 sentence explanation. String together all principles, constraints, and good practices, with the explanatory material in the box, and you'd have a very short, accessible summary of the whole document. My main interest in such a re-organization is that many of the principles are stated as buzzwords. They mean a lot to people who already know them, but are pretty abstract otherwise. I think that this short overview of the content of the arch. document would help make what is a somewhat long document much more accessible and likely to be used.
Propose editorial response.
Created a summary; probably not quite what reviewer wants.
I was reviewing the OASIS XRI (eXtensible Resource Identifier) Specification [2], one of the components of the Dataweb model [1]. I'm curious: does this new work affect the *WebArch* in any way?
Some of the "stories" in the document seem pointless to me, I suggest dropping or expanding them: expand story in 2.6; story in 3.6 is too unlikely and extreme, might be dropped; expand story in 3.6.3; expand or drop story in 4.5.3.
Propose editorial response.
Added forward reference in section 3.6.1 per reviewer suggestion. Moved story from beginning of section 3.5 to a few paragraphs in. Moved one sentence from 3.5 story to section 3.5.1 story. Incorporated other editorial suggestions except the definition of Link in section 5.
Does a URI *reference* or identify a resource? Is there even a difference? I'm unsure here, but the choice of words might cause confusion.
The TAG believes that the document is sufficiently clear regarding reference/identify as is.
Send TAG a draft of a response to reviewer in light of decision.
First para says "before inventing a new data format, designers should carefully consider re-using one that is already available" but the whole doc doesn't seem to say why all XML formats shouldn't be application/xml.
IJ and CL to draft a proposal to address this issue. (No clear direction from 14 May 2004 minutes, but there was discussion about whether the content was "designed for presentation".)
"Namespaces in XML provides a mechanism for establishing a globally unique name that can be understood in any context." What does it mean to understand a name? Should this say that the globally unique name can unambiguously identify the intended meaning of the element/attribute?
Subsumed.
The word "understood" has been deleted in the rewrite.
Below the Good Practice: QName Mapping - the section (or some other) should probably say more on the interaction of QName Mapping, fragment identifiers in XML (4.5.8) commonly used for this mapping and namespace documents (4.5.4)
Overtaken by events.
Seek clarification on kopecky5:
DC believes this message from reviewer indicates he has completed his action.
The TAG accepted the proposal.
Address reviewer's comments (see DC action).
Added "One particularly useful mapping in the case of flat namespaces is to combine the namespace URI, a hash ("#"), and the local name; see the section on XML namespaces for more examples."
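The mapping described in the added sentence can be sketched as plain string concatenation. This is only an illustration: the function name `qname_to_uri` and the namespace URI used below are hypothetical, and the sketch assumes the caller has already resolved the QName prefix to its namespace URI.

```python
def qname_to_uri(namespace_uri: str, local_name: str) -> str:
    """Combine a namespace URI, a hash ("#"), and a local name into one URI.

    Hypothetical helper for illustration; it performs no validation and
    assumes a flat namespace, as in the sentence it illustrates.
    """
    return namespace_uri + "#" + local_name

# Hypothetical namespace URI and local name.
print(qname_to_uri("http://weather.example.com/ns", "temperature"))
```

Other mappings are possible; this one is attractive for flat namespaces because every name in the namespace becomes a fragment of a single namespace URI.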
This section lacks a conclusion, any kind of statement on what should/should not be used. Or words that at the moment there is no conclusion.
The TAG believes this is an open issue in Web architecture, which the Editor should highlight more in the text.
Overtaken by events.
Clarify that this is an open issue.
In section 4.5.6, added note: "The TAG expects to continue to work with other groups to help resolve open questions about establishing "ID-ness" in XML formats."
The TAG accepted the proposal.
The headline should be worded so that it is easy to remember and says the right thing (rather than its opposite)....
Propose editorial response.
Changed titles of GPNs that reviewer wished changed. Now only using principle, constraint, good practice (in that order). Also highlighted in abstract.
3.3 Internet Media Type: Streamlining terminology. For URIs, a clear distinction is made between Resource and Representation. But when it comes to fragid, this distinction is suddenly gone, without any explanations. (See email for more detail)
Respond to MD, acknowledging the dependency between arch doc and RFC2396bis.
See Reply from MD
Scenarios appear to be based on "static" URI's; i.e.: "persistent" URI's (reference chapter 3.6.2). Suggest discussion of "dynamically generated" URI's; particularly addressing situations where dynamic URI's are bookmarked or forwarded by a user.
Ask the reviewer to clarify the question. The TAG believes the reviewer has misunderstood the notion of "URI persistence".
[Empty]
The statement "one might reasonably create URI's that ..." in the following passage may be inappropriate, as the preference for viewing a resource in Italian or Spanish should be communicated as meta information within the context, for which mechanisms such as CC/PP are being developed. To countenance the use of non-unique URI's for such a purpose is unwise.
The TAG believes that it is useful to indicate that there are two resources (one Spanish and one Italian) but to add to the example some discussion of content negotiation.
Propose editorial response.
No change. It is probably useful to have a URI for a resource where language is content-negotiated, but it is also useful to have a URI for the "resource in language L".
Add some text that talks about content negotiation in addition to leaving the distinct URIs (one per language resource).
Deleted the language-specific URIs since they do not identify the same resource. Deleted the example but augmented the discussion of content negotiation per the 14 May ftf meeting.
Suggest discussion of accessing different representations (i.e.: transformed) of the same URI. This relates to Device Independence Principles DIP-1 (Device Independent Access) and DIP-2 (Device Independent Web Page Identifiers). Mention of HTTP Content Negotiation appears insufficient.
Propose editorial response.
In section 3.4, added "or transformed dynamically to the hardware or software capabilities of the recipient".
No version info in media types? Media type as special case of content negotiation?
The TAG agrees that more needs to be stated about trade-offs. The TAG has created a new issue mediaTypeManagement-45.
Write a draft finding on the benefits and limitations of the media type mechanism.
Add text to the document about media type limitations, versioning, and link to issue mediaTypeManagement-45.
Added: "Internet media type mechanism does have its limitations. For instance, media type strings do not support versioning or other parameters. The TAG issue mediaTypeManagement-45 concerns the appropriate level of granularity of the media type mechanism."
While I suspect that the older language for describing these semantics had its own problems, I would be happier either with (1) its return or (2) some further amplification or clarification of the existing language.
Include a reference to RFC2396 in the document. Inform reviewer of TAG's resolution.
Included reference to RFC2396 definitions of primary and secondary resource.
See writeup from KC.
AWWW abjures URI ambiguity; but in trying to think carefully about this, I've realized that it's important to distinguish two kinds of URI ambiguity: diachronic and synchronic. The AWWW only addresses the former kind, and I think it should address the latter kind, too. I'd like to see some language in the AWWW about avoiding synchronic ambiguity by avoiding the "URI overloading" mistake with content negotiation.
The AWWW says that one may conclude that agents or representations are each referring to the same resource if they are using identical URIs. But that's problematic; it suggests that the relation between resources and URIs is in some sense timeless and static. Once a URI has been coined to identify a given resource, it can only ever identify precisely that resource; else, we have to embrace the willy-nilly change problem.
The TAG believes the reviewer's question is addressed by section 3.6.2 of the document.
Respond to reviewer.
The first good practice says, in my paraphrase, that (1) good representation types allow users to make links to other resources and to parts of representation-states of resources. The second good practice says, again in my paraphrase, that (2) good representation types allow users to make "Web-wide" links rather than merely "internal document" links. Aren't these redundant?
Surely the AWWW also wants to say that for those kinds of web application or scenario -- Service Oriented Architecture and Semantic Web being the two obvious examples -- where hypertext is not the "expected user interface paradigm", by virtue of the fact that there really isn't a UI per se, one still wants to prefer representation types which allow users to make hypertext links between resources. REST and SOAP and RDF and WSDL and a lot of other fun stuff works precisely because -- even in the absence of any human-facing UI -- what's happening is that messages are being passed around between machines, some of which contain assertions about resources, and they are messages which contain hypertext links to other resources.
The real problem here is that there is no real formalization of "hypertext link" in the AWWW. If it means A-HREF links simpliciter, then my point about SOA and Semantic Web exceptions to this practice is unmotivated and null. But if, as seems likely from Section 4.5.2. Links in XML, "hypertext links" encompasses any link mechanism (that is, XLink and friends) whereby HTTP URIs identify resources with which agents may interact with the resources-states thereof, then something like my point is needed.
Incorporate DC suggested tweak for section 2:"Formats that allow content authors to use URIs instead of local identifiers foster the "network effect": the value of these formats grows with the size of the deployed Web."
Incorporated.
I don't agree with the exceptionless form of this principle. I think one can imagine silent error recoveries which aren't harmful. I suggested an amended version: silent error recovery is harmful if, and only if, it does some harm beyond mere failure to notify; or, put better: mere failure to notify isn't always a harm. (I'd be just as happy with the smallest possible weakening of the principle, something like: "Silent recovery from error is usually [or "typically" or "often"] harmful.")
The TAG believes that text from the approved finding "Authoritative Metadata" will address the reviewer's concerns. Henceforth, refer to issue dhm2.
Propose editorial response.
Good practice note in section 1.2.3 no longer talks about "silent recovery" but rather recovery "without user consent". Added text from finding: "Consent does not necessarily imply that the receiving agent must interrupt the user and require selection of one option or another. The user may indicate through pre-selected configuration options, modes, or selectable user interface toggles, with appropriate reporting to the user when the agent detects an error." Updated GPN in section 3.4.1 as well.
Updated proposal
Distinguish error correction from error recovery.
This is often harder than the AWWW lets on, and sometimes it's simply not possible at all. I think the language should be modulated to reflect that reality; see proposed text.
Propose additional text regarding separation of content and presentation that includes more about tradeoffs.
"the ambiguous use of terms" is ambiguous; and, contrary to the AWWW's (fairly casual, of course) claim, ambiguity does *not* always impose a cost in human communications -- a research result demonstrated by UK cognitive scientists, among others. (If you want the full cite to this paper on CiteSeer, I can drum it up.)
Propose editorial response.
Attempt to soften claim about cost of overloading by adding "often" in first paragraph of section 2.2.
Updated proposal
Issue may be subsumed and mooted by this draft.
There is a paragraph about URI ownership in Section 3.4, and I can't understand what it's doing there. I would strike or amend it. Full discussion of this issue.
Propose editorial response.
I believe that edits to section 3.4 may satisfy the reviewer's comments; that text has been removed.
I really really hate the way this discussion is framed in the AWWW; see proposals.
I find this sentence, from Section 2. Identification, to be garbled at best; see discussion.
Propose editorial response.
Edited sentence in section 2: "Formats that allow content authors to use URIs instead of local identifiers foster the "network effect": the value of these formats grows with the size of the deployed Web."
I think this, as stated, is too strict; see further discussion.
The Editor will add text about the good practice of using server-side redirects to connect two resources when people start using "the wrong URI".
Propose editorial response.
In section 2.1.1, added last paragraph: "When a URI alias does become common currency, the URI owner should use protocol techniques such as server-side redirects to connect the two resources. The community benefits when the URI owner supports both the "unofficial" URI and the alias.".
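A server-side redirect of the kind mentioned in the added paragraph might look like the following minimal sketch. The alias table, the paths, and the `respond` function are all hypothetical; a real deployment would normally configure this in the web server rather than in application code.

```python
# Hypothetical table mapping a widely used alias path to the path the
# URI owner treats as the official one.
ALIASES = {"/oaxaca-weather": "/oaxaca"}

def respond(path: str) -> tuple[int, dict[str, str]]:
    """Return (status code, headers) for a request path.

    Requests for a known alias get a permanent redirect to the official
    URI, so both URIs keep working and agents learn how they relate.
    """
    target = ALIASES.get(path)
    if target is not None:
        return 301, {"Location": target}
    return 200, {}

print(respond("/oaxaca-weather"))
print(respond("/oaxaca"))
```

Using a redirect rather than serving the same representation at both URIs lets clients discover which URI the owner prefers.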
The first two comments from the reviewer are editorial.
Propose editorial response.
This draft addresses the reviewer's editorial comments.
When determining the uniqueness of a URI, is the fragment identifier considered part of the identifying URI? If there is an argument list, does the ? and what follows constitute part of the unique URI?
The Editor believes this question is answered by the draft.
The Editor believes this question is answered by the draft: The string is compared character for character, and a URI string can include a fragment identifier.
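The character-for-character comparison described in the Editor's answer can be sketched as follows. The function name `same_uri` is hypothetical, and the sketch deliberately applies no normalization, which is an assumption made to match the wording of the answer rather than a statement about what any particular agent does.

```python
def same_uri(a: str, b: str) -> bool:
    """Compare two URI strings character for character.

    Hypothetical helper: the fragment identifier and the query are part
    of the string, so they take part in the comparison.
    """
    return a == b

base = "http://weather.example.com/oaxaca"
print(same_uri(base, base))                       # identical strings
print(same_uri(base + "#ws17a", base))            # differ by fragment
print(same_uri(base + "?units=c", base))          # differ by query
```

Under this comparison a URI with a fragment identifier or a query is a different string from the URI without it.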
Excerpt: "The reader is left with the impression that 'obviously, the legal requirement justifies the actual practice. It constitutes an overriding concern.' This is not necessarily the case. Please don't reinforce misconceptions. Particularly if they have consequences that reduce the universality of access to web mediated transactions."
The TAG believes that the 10 May 2004 draft addresses the reviewer's comment.
Propose editorial response.
Removed contentious text from section 4.3. Text now reads: "Of course, it may not always be desirable to reach the widest possible audience. Designers should consider appropriate technologies for limiting the audience. For instance digital signature technology, access control, and other technologies are appropriate for controlling access to content."
The reviewer explained why "orthogonality" was the wrong term to use. See email for details.
The TAG believes that "orthogonal", not "independent", is the proper term. E.g., the HTTP specification depends on the URI specification, but they are orthogonal. The TAG also decided to remove the term "loosely coupled" and to change "independent" to "may evolve independently."
Incorporate TAG's resolution.
The reviewer suggested a number of editorial corrections.
Propose editorial response.
This draft incorporates reviewer's editorial suggestions.
The validity of this principle depends very much on what level one is talking about. Is silent recovery from packet collisions harmful? With an ECC memory: should every correctable memory problem be reported? Must an application that normalizes input data so that out-of-range values are normalized to the valid extreme of the range report every bad data item? To whom? (Suppose the data stream represents instrument measurements streamed to a Web display.) We suggest that the rule be:
"Silent recovery from errors may hinder problem diagnosis. Furthermore, silent recovery of errors resulting from erroneous input may inappropriately promote use of non-compliant data formats."
We also note that there is a tension between this principle and the notion of "must-ignore". For many applications, "what you don't understand" is equivalent to "an error". So one principle says you should ignore (presumably silently) this data, and the other says you should not.
I agree with this principle in cases where the way in which the agent recovers affects the resulting service provided to the user. There are error cases where this is not the case - for example, a 401 in http where the problem was a stale nonce, or an error indicating that a message was lost and should be retransmitted. In such cases, silent recovery probably makes sense.
Henceforth, refer to issue dhm2.
Propose editorial response.
Good practice note in section 1.2.3 no longer talks about "silent recovery" but rather recovery "without user consent".
Updated proposal
Distinguish error correction from error recovery.
[Section 2] assumes that identification and retrievability are the same thing. Given the extensive use, starting with namespaces, but continuing with the identification of XSLT and XQuery functions, and so on, of using URIs to identify non-retrievable and abstract entities, this conflation is problematic at best.
Decided at Ottawa f2f. (No action.) The new text about information resources is believed to address this issue.
The architecture document needs to do a better job of explaining what a resource is in this context. (See email for more info)
Decided at Ottawa f2f. Added 2.5.2 Representation reuse.
"Good practice: Fragment identifier consistency: A resource owner who creates a URI with a fragment identifier and who uses content negotiation to serve multiple representations of the identified resource SHOULD NOT serve representations with inconsistent fragment identifier semantics.
If the term "consistent" is here used in a technical sense, please explain what it means and how inconsistencies are to be detected. If it is used in a non-technical sense, please explain what it means.
We note that if fragment identifiers must be usable in more than one MIME type, the result will be that the only fragment identifiers effectively allowed will be bare names (or other fragment identifier syntaxes incapable of knowing about or exploiting any of the structure of the data); it seems undesirable to impoverish the URI identifier space in this way.
In general, content negotiation (like server-side browser sniffing) does not seem to us to be an obviously and universally good thing: it leads to unpredictable context-dependent results in ways that are actively hostile to some machine-driven applications, and it interacts in this pernicious way with fragment identifiers. On the other hand, if content negotiation is indeed important to make things work, perhaps some advice on whether newly invented schemes should support the equivalent of content negotiation is in order.
This is not a viable best practice recommendation, except as a bandaid, as it tightly couples URIs to representations, and constrains representation evolvability in untenable ways. This appears to highlight a weakness in the Web architecture that should be explicitly addressed.
See msm13.
Paste RFC2396 text in and ask the Schema WG to review.
In section 3.3.2, Added some text from RFC2396 bis per 3 May teleconf. The new text does NOT say "don't use content negotiation".
[3.3.1] says: "Per [URI], in order to know the authoritative interpretation of a fragment identifier, one must dereference the URI containing the fragment identifier. The Internet Media Type of the retrieved representation specifies the authoritative interpretation of the fragment identifier. Thus, in the case of Dirk and Nadia, the authoritative interpretation depends on the SVG specification, not the XHTML specification (i.e., the context where the URI appears)."
But this seems to contradict the referenced URI specification, which says:
"The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of the retrieved representation, even though such a retrieval is only performed if the URI is dereferenced."
The latter says clearly you need not dereference. On the contrary, you must know the range of representations that you might get _if_ you tried to dereference.
Decided just prior to the Ottawa f2f. Incorporating CL’s edits of 3.3.1 has fixed this issue (by making the webarch document the same as 2396bis).
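As a rough illustration of the point under discussion, namely that the interpretation of a fragment identifier depends on the media type of the representation rather than on the context in which the URI appears, the following Python sketch dispatches on media type. The example URI, the rule table, and the wording of each interpretation are hypothetical simplifications and are not taken from either specification.

```python
from urllib.parse import urldefrag

# Hypothetical, deliberately simplified per-media-type interpretations.
FRAGMENT_RULES = {
    "text/html": lambda frag: f"HTML element with id '{frag}'",
    "image/svg+xml": lambda frag: f"SVG element or view named '{frag}'",
}


def describe_secondary_resource(uri, media_type):
    """Describe what 'U#F' identifies, given the media type of a representation of 'U'."""
    primary, fragment = urldefrag(uri)
    if not fragment:
        return f"primary resource {primary}"
    rule = FRAGMENT_RULES.get(media_type)
    if rule is None:
        # No media type specification is known to this sketch.
        return f"no defined interpretation of '{fragment}' for {media_type}"
    return rule(fragment)


# The same URI yields different secondary resources for different media types.
uri = "http://example.com/diagram#legend"  # hypothetical URI
as_svg = describe_secondary_resource(uri, "image/svg+xml")
as_html = describe_secondary_resource(uri, "text/html")
```

The sketch also makes the reviewer's follow-up question concrete: when no media type is available, the lookup has nothing to dispatch on.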
[3.3.1] says
"Given a URI "U#F", and a representation retrieved by dereferencing URI "U", the (secondary) resource identified by "U#F" is determined by interpreting "F" according to the specification associated with the Internet Media Type of the representation."
What if the scheme is not HTTP and media types are not used (e.g. because the URI uses the file: scheme or for some other reason)? Do fragment identifiers work only with media-typed representations? We hope not.
[3.4.1] Says that user agents must not silently ignore server metadata. Metadata covers a lot of ground: what is its scope? May a user agent ignore a server-specified DTD or Schema and choose to apply a local variant (e.g. because the user so specifies in a local configuration file or a launch-time option)? Why not?
If the sender is not a trusted authority, it would be foolish for the recipient to rely on the principle of sender-makes-right. A well written production server runs an unacceptable risk if it accepts at face value everything an untrusted client tells it. Must it inform the client each time it follows its own instructions by ignoring client information?
Overtaken by events.
(We also note in passing that focusing on the interactions between "user-agents" and "servers" is fundamentally limiting in the sense mentioned in our opening comment. Are not peer-to-peer interactions covered by this architecture?)
Overtaken by events.
[3.5] says that an interaction is safe if the agent does not incur any obligation beyond the interaction. This seems too broad; the TAG has been advised of other scenarios. For example, if each access to a resource needs to be authenticated at the application (not https) level, but no ongoing obligation is established, this rule suggests that the retrieval is safe. Is that really true? We wouldn't want the access cached, except perhaps by an application-specific cache that knew our authorization rules. Consider also the case where the provider of the resource needs to log the access. The issue is an important one, and the summary given here comes close to being an oversimplification.
[3.5.1] Says:
"There are mechanisms in HTTP, not widely deployed, to remedy this situation. HTTP servers can assign a URI to the results of a POST transaction using the "Content-Location" header (described in section 14.14 of [RFC2616]), and allow authorized parties to retrieve a record of the transaction thereafter via this URI (the value of URI persistence is apparent in this case). User agents can provide an interface for managing transactions where the user agent has incurred an obligation on behalf of the user."
Yes, but is this saying specifically that content-location SHOULD be used? If so, say so. If not, then make clearer what's intended.
[3.6.1] Good practice: Available representation Publishers of a URI SHOULD provide representations of the identified resource.
We are concerned that this appears to privilege dereferenceable URIs over other sectors of URI space; in particular to denigrate all uses of URIs as pure (non-dereferenceable) identifiers, such as namespaces, QT functions, SOAP extensions, SAX properties, etc. etc. There are often pragmatic reasons for declining to make URIs dereferencable (unwelcome load on servers, for example, or identifiers that are intended purely for software systems and that humans will never see or need to dereference to obtain useful information). It seems to us that at least a coherent story should be told about how this pure-identification use fits into the overall Web architecture.
Decided at the Ottawa f2f. Added good practice about unnecessary network access.
[4.2] In general, the section on versioning unduly and in too many ways oversimplifies a complex, subtle, and as yet poorly understood problem. For example, 4.2.3 says:
"Language designers SHOULD provide mechanisms that allow any party to create extensions that do not interfere with conformance to the original specification."
This oversimplifies a very tough tradeoff. When you allow such extensions, you promote reuse of the base language for new purposes, and that seems good. You also provide for a proliferation of potentially non-interoperable versions depending on various extensions, as well as ensuring that some data will be accepted by processors when it is in fact not conforming to a later or extended definition of the language, but is simply erroneous and ought (if the processor were only omniscient) to be rejected as such with a useful diagnostic. That's bad.
Pursuing the principle enunciated here, one might conclude that maybe XML should have let anyone who wanted to define new syntactic constructs such as structured attributes? They didn't, and interoperability is helped rather than hurt by such strictness.
There is a strong tension between versioning and extensibility and silent error handling, once you get away from human mediated interactions and interactions that do not involve mission- or life-critical applications. For computer-to-computer mission-critical applications, "fallback behaviour" is semantically equivalent to "silently handling errors" and the Web architecture document is thus self-contradictory.
In addition, versioning and extensibility are not solely a property of data representations, but of protocols as well.
The TAG agrees with the reviewer that the text does not communicate why extensibility may not be appropriate in some cases. Furthermore, the TAG has resolved to delete the phrase "falling back to default behavior".
Propose text on tradeoffs for section 4.2.2.
Delete "falling back to default behavior."
Deleted "falling back to default behavior" in section "Extensibility"
Insert a story in the section on extensibility about protocol extensibility.
[4.2.3] The discussion of mustIgnore & mustUnderstand should clarify the difference between marking the distinction in the document instance, in a schema, or in prose documentation. SOAP does it with an attribute in the instance. Schema content models do it in the schema. Other systems provide rules in the specifications. These have different tradeoffs.
Write some text to address schema14 (e.g., by reviewing finding).
[4.2.4] Says:
"In principle, a SOAP message can contain a JPEG image that contains an RDF comment which refers to a vocabulary of terms for describing the image."
This is untrue: SOAP is XML, JPEG is not. MTOM may do something to extend SOAP to make this true, but as it stands the statement is false. Perhaps "... can contain an SVG image that contains ..." is what you meant to write.
The TAG believes the 10 May 2004 draft addresses the reviewer's concerns.
Satisfied with editorial change in 10 May 2004 draft.
Propose editorial response.
In section 4.2.4, changed JPEG to SVG.
Change approved.
[4.5.1] is on when to use XML-based formats. The analysis here seems underdeveloped and may perhaps best be left out. If it is kept, then additional reasons for using XML documents include:
Classified as editorial, addressed by the editor.
[4.5.3] States:
"Namespaces in XML" [XMLNS] provides a mechanism for establishing a globally unique name that can be understood in any context.
This is a false statement and should not be continued to be repeated.
Propose editorial response.
Per 22 March teleconf, deleted "that can be understood in any context" from section 4.5.3.
Incorporate changes per resolution.
[4.5.3] Says:
"The type attribute from W3C XML Schema is an example of a global attribute."
This should indicate type in the Schema Instance namespace, preferably with a suitable link to our spec. Perhaps
"The type attribute from W3C XML Schema namespace is an example of a global attribute."
There are also type attributes in the schema document vocabulary, e.g. on <xsd:element>, and those are not global. Furthermore, we see above in 4.5.6 that a prefix is used to indicate xs:ID as a type. So, why not use xsi:type for this one:
"The xsi:type attribute, provided by W3C XML Schema for use in XML instance documents, is an example of a global attribute."
Include a clearer reference to the XML Schema specification.
Satisfied with editorial change in 10 May 2004 draft.
Propose editorial response.
Adopted reviewer's proposal in section 4.5.3 to use "xsi:type".
Make changes to point to XML Schema spec (possibly with the xsi ns URI in text).
Para now starts: "The type attribute from the W3C XML Schema Instance namespace "http://www.w3.org/2001/XMLSchema-instance" ([XMLSCHEMA], section 4.3.2) is an example of a global attribute. It can be used by authors of any vocabulary to make an assertion in instance data about the type of the element on which it appears. As a global attribute, it must always be fully qualified. "
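The distinction between a global, namespace-qualified attribute and a local, unqualified one can be sketched in Python with the standard library's ElementTree. The "order" and "quantity" elements and the local "type" attribute below are hypothetical; only the XML Schema Instance namespace URI comes from the text above.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document: "quantity" carries both the global
# xsi:type attribute and an unrelated local attribute also named "type".
DOCUMENT = """\
<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <quantity xsi:type="xs:integer" type="bulk">12</quantity>
</order>"""

root = ET.fromstring(DOCUMENT)
quantity = root.find("quantity")

# The global attribute is addressed by its fully qualified name.
schema_type = quantity.get(f"{{{XSI_NS}}}type")

# The local attribute has no namespace; its meaning comes from the element.
local_type = quantity.get("type")
```

Because the global attribute is always fully qualified, it cannot collide with a local attribute of the same local name on any element.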
[4.5.3] Says:
"Attributes are always scoped by the element on which they appear. An attribute that is "global," that is, one that might meaningfully appear on different elements, including elements in other namespaces, should be explicitly placed in a namespace. Local attributes, ones associated with only a particular element, need not be included in a namespace since their meaning will always be clear from the context provided by that element."
This appears to mix the notion of element instance and what DTD-oriented minds would call 'element type'. Perhaps this should read
An attribute that is "global," that is, one that might meaningfully appear on elements of any type, including elements in other namespaces, should be explicitly placed in a namespace. Local attributes, ones associated with only a particular element type, need not be included in a namespace since their meaning will always be clear from the context provided by that element."
The TAG agreed with the reviewer.
Satisfied with change in 10 May 2004 draft.
Propose editorial response.
Adopted reviewer's proposal in section 4.5.3: "An attribute that is "global," that is, one that might meaningfully appear on elements of any type, including elements in other namespaces, should be explicitly placed in a namespace. Local attributes, ones associated with only a particular element type, need not be included in a namespace since their meaning will always be clear from the context provided by that element."
[4.5.6] Fails to carefully highlight the particular flavours of "ID" in play, and that they are NOT the same thing. For example, consider the following three statements:
In practice, applications may have independent means of specifying IDness as provided for and specified in XPointer. XPointer carefully discusses these options.
The TAG believes that the 10 May 2004 draft addresses the reviewer's concern, with the following changes:
Clarify that this is an open issue.
In section 4.5.6, added note: "The TAG expects to continue to work with other groups to help resolve open questions about establishing "ID-ness" in XML formats." Also added fourth bullet per reviewer's suggestion.
Adopt text per TAG resolution.
New fourth bullet: "In practice, applications may have independent means (such as those defined in the XPointer specification, [XPTRFR] section 3.2) of locating identifiers inside a document."
The end of the reviewer's mail suggests some editorial changes.
Propose editorial response.
Adopted editorial suggestions except:
There is a problem with http URI implying GET as a default method and not allowing any other method for retrieval. (See remainder of email for details).
The reviewer made a number of editorial comments.
The sample principle "self-descriptive markup" makes me nervous: surely the TAG does not believe that XML (or any other system) is a self-describing format in the sense that anyone looking at any instance of the format will understand what is going on without having to have recourse to any documentation? Neither XML nor any other format or notation matches this description. Some formats are or can be self-describing in that the notation can be used to describe the notation: one can write a grammar in BNF for BNF itself, and one can write a schema in XML to define the XML vocabulary for writing schemas. But such recursion is possible, surely, primarily for notations which are intended to be used for defining notations.
The TAG resolved to keep the drop shadow for aesthetic reasons.
Propose editorial response.
Adopted editorial suggestions except for removing the shadow from the first illustration.
Part of 10 May 2004 draft.
The first paragraph of this section is incomprehensible to me; I hope it can be rewritten.
What does it mean to define an identifier without knowing what representations are available? Representations for what? For the identifier? For the thing identified? For something else entirely? In general, I would have thought that before assigning identifiers to things it would normally be useful to know what one was identifying. In programming languages, what one identifies with identifiers are typically representations of things (representations of integers, representations of character strings, etc.); in that context, it seems nonsensical to say without qualification (as is done here) that identifiers can be assigned without any knowledge about available representations; it is only a knowledge of the available representations and their characteristics that allows one to decide intelligently what different things need to be distinguished and thus what different things will need to be identified.
Overtaken by events.
Propose editorial response.
Adopted editorial suggestion.
It's clear that the world would be a better place if specifications were more consistently implemented and their nuances more consistently observed. It's not quite clear to me that the world will be a better place if we assign all authority for document metadata to the server and remove all possibility of overriding it in the document itself. The principle enunciated or illustrated here works well when the systems administrator responsible for the server knows the character encoding, content-type, etc., of each resource served, cares about serving correct metadata, and knows how to configure the server to achieve that result. It works less well when any of those conditions ceases to apply.
It is not unusual (in my experience, at least) for the author or provider of a document to know more about it than the maintainer of the Web server; if the in-line metadata and the metadata provided by the server are in conflict, it is not always my experience that the server is right and the author wrong, and it troubles me to see the web architecture document effectively disenfranchising the latter in favor of the former.
Section 3.4.1, Principle: Authoritative server metadata, says "User agents MUST NOT silently ignore authoritative server metadata" and discusses the responsibility of server managers in the provision of metadata.
This principle appears to mean that the only first-class citizens of the Web are server managers. Any content provider in the position of controlling the content, but not the server configuration, is at the mercy of the server manager; this situation is unproblematic when the server manager takes seriously the responsibilities assigned here; it seems likely to lead to problems in organizations where a typical exchange between content provider and webmaster runs like this:
Content Provider: This document needs to be served in UTF-8, not ISO Latin-1.
Webmaster: I'm busy, I don't have time for this kind of thing, so get lost.
Content Provider: Also, the expiration time should be thirty days, not two hours.
Webmaster: Close the door on your way out, OK?
It seems to me that local authority on metadata would be an approach more consistent with the principle of decentralization which governs Web architecture in other respects.
1.2.2 para 5 ("Ideally, many ..."). Sentence 2 ("Languages that exhibit this property are said to be 'extensible'") seems to say that if an instance of a larger language can be processed as though it were an instance of a smaller language, then the larger language is said to be "extensible". I think the term is probably better taken as referring to the smaller language; I think the paragraph should probably be rewritten from scratch, since with the current structure it will be difficult to provide a clear antecedent of the phrase "this property".
In any case, the current formulation invites the reply that OF COURSE some instances of a superset language may be processed as if they were members of a subset language: in any plausible case, a large number of members of the superset language ARE members of the subset language; that is what it means for one language to be a superset of another. I think the instances you wish to refer to particularly are those members of the superset language which are NOT instances of the subset language, but which can nonetheless successfully be processed by a suitable processor. The analysis here is weakened by its failure to acknowledge explicitly that the property in question is not a property of the language by itself but a property of the particular kind of processing involved, and the coding of the processor. (Here as elsewhere the document appears to fall into the trap of speaking as if only one kind of processing were liable to be applied to any particular document, or any particular language; this is not the case for any language intended to promote the reuse and repurposing of data, and that fact is of material importance in any discussion of extensibility.)
1.2.2 para 5 ("Ideally, many ..."). I think the end of the paragraph would be more persuasive if "ignore" and "treat as error" were not the only examples given of default processing rules. The "ignore" approach is (as formulated here and elsewhere) an oversimplification. It is both underspecified and excessively specific. Underspecified, because most proponents do not distinguish between ignoring the unknown element and ignoring the tags which mark the beginning and ending of the unknown element, and some participants in the discussion fail to understand the difference, as is illustrated by the following paragraph of this document. Excessively specific, because ignoring is not the only plausible default processing rule, and in many contexts it's easy to think of a better. A pretty-printer should use its default line-break and indentation rules; a search system should use its default indexing rules; an editor should use its default display; a transformation system should perform the identity transform, or suppress the element (is this the same as 'ignoring' it? I don't think so), or perform another default action (such as the default action specified by XSLT). These do not all seem to me to be the same as "ignoring" either the element or the tags.
1.2.2 para 6 ("For example, from ...") says "from early on in the Web, HTML agents followed the convention of ignoring unknown elements." I think you mean to say "ignoring unknown tags" instead; I think many of your readers will, like me, understand "tags" as marking the beginnings and endings of "elements", and many of them may, like me, be confused at first by this sentence, which seems to mischaracterize browser behavior. At least in my experience, browsers ignore tags, such as the start- and end-tags of 'blink' elements, but do not ignore elements, which is why the content of 'script' elements must be wrapped in comments in order to avoid problems with earlier browsers, which ignore the 'script' tags, but not the 'script' element. (It is possible that this sentence is intended to refer to some different browser behavior, such as ignoring the absence of explicit end-tags, or the typical decision by browsers to ignore violations of SGML's element nesting rules; in this case, reformulating the sentence is even more important.)
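The difference between ignoring tags and ignoring elements can be sketched with Python's standard html.parser module. This is an illustrative model, not a description of any real browser: the set of known tags and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

KNOWN_TAGS = {"p", "b"}  # hypothetical vocabulary of an "early" agent


class TagIgnoringRenderer(HTMLParser):
    """Drops unknown start- and end-tags but keeps the content between them."""

    def __init__(self):
        super().__init__()
        self.output = []

    def handle_starttag(self, tag, attrs):
        if tag in KNOWN_TAGS:
            self.output.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in KNOWN_TAGS:
            self.output.append(f"</{tag}>")

    def handle_data(self, data):
        # Character data is kept whether or not the enclosing tags are known.
        self.output.append(data)


def render(markup):
    renderer = TagIgnoringRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.output)


blink_result = render("<p>Hi <blink>there</blink></p>")
script_result = render("<p>a</p><script>alert(1)</script>")
```

In this model the content of the unknown 'script' element leaks into the output as text, which is the behavior the reviewer describes; an agent that ignored unknown elements would drop that content as well.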
Section 2, introductory paragraphs. In the introduction to this section, the failure of the document to make any serious attempt to define the term 'resource' begins to bite you -- and more to the point, begins to cause problems for the reader. I recognize that it's difficult to define 'resource' well, but I believe it essential that you try. If definition proves absolutely impossible, you can of course take it as an undefined primitive notion, but to make that approach useful I think you would need to specify explicitly the relations which are postulated as holding between resources and other primitive notions.
In the current draft, you are making things too easy on yourselves; the document suffers.
Some questions one might hope to have some light shed on by either a definition or by a non-defining description of resource as a primitive notion:
Section 2 para 3 says
When a representation uses a URI (instead of a local identifier) as an identifier, then it gains great power from the vastness of the choice of resources to which it can refer.
This suggests that URIs have the advantage, compared to local identifiers, of being more numerous. But if we assume that both URIs and local identifiers are finite-length strings without any length restriction we need worry about, then both sets are enumerably infinite and there is a one-to-one mapping between them, so that they have exactly the same cardinality and neither is any more vast than the other.
I suspect that what is meant here is that URIs have the advantage of being dereferenceable; this is true of some URIs, but not, I think, of all.
Overtaken by events.
Might be subsumed.
Might be subsumed by new text in 2 and 2.1
Section 2, Principle: URI assignment says: "A resource owner SHOULD assign a URI to each resource that others will expect to refer to."
In order to comply with this principle, it seems to be necessary for resource owners to know what resources they own, or (equivalently) to know, of each thing they own, whether it is a resource or not. It doesn't seem plausible to expect compliance with this principle if "resource" is not defined more informatively than it is defined in this document.
It may also be noted in passing that this principle also requires that resource owners predict what other actors will expect; it would be nice if the principle could be reformulated without requiring owners to perform such predictions.
Note also that if resources can be any "items of interest" (as stated by section 1), it may be impossible for a resource owner to provide URIs for every resource which may be an item of interest. If there is an owner of the real numbers, for example, that owner cannot comply with the principle enunciated here. If anyone owns an infinite set of items of interest, and if sets of such items are thought to be themselves potential items of interest, then that owner cannot, in principle, provide URIs for all items of interest: the power set of an enumerably infinite set is not enumerable, and neither URIs nor any other finite names can be provided for all the members of a non-enumerable set.
I wonder if some slightly less demanding principle ought to be enunciated.
Overtaken by events.
Might be subsumed.
Might be subsumed by new text in 2 and 2.1
Section 2.2, bulleted list, first item. It would be useful, I think, if this were expounded at greater length. It is not necessarily clear to all readers (it is, for example, not entirely clear to me) how the hierarchical delegation here postulated follows from the wording of the specifications defining the HTTP and mailto schemes.
This reader wonders at this point whether there are any constraints on the interpretation which the definer of a media type can place on fragment identifiers for the media type. Can one, consistent with Web architecture (if not necessarily with good design) define a media type (let us say application/sortes) where the meaning of a fragment identifier is identified by taking a checksum of the octet string returned, the conventional numerological value of the string used as the fragment identifier, multiplying the one by the other, and using the product to look up a passage in a copy of Vergil, with the stipulation that the meaning of the fragment identifier is "the meaning of the passage found by this method"?
Section 3.3.2, para 3 ("On the other hand ...") says "it is considered an error if the semantics of the fragment identifiers used in two representations of a secondary resource are inconsistent." What does "inconsistent" mean here? How do the responsible parties determine whether a given plan of using fragment identifiers is or is not compliant with this rule?
Suppose that an internet media type (application/my-magic-mediatype) is defined with the basic rule that it is represented by servers as an XML data stream rooted in a particular namespace (e.g. one for purchase orders), and that its fragment identifiers are syntactically identical to those of the application/xml media type, but denote not individual XML elements or attributes but instead whatever real-world objects are represented by those elements or attributes (a customer, an invoice, a payment obligation, ...), if any, or else have no denotation.
Suppose further that a resource owner serves the same octet sequence as two different media types (e.g. application/xml and application/my-magic-mediatype). Is the resource owner (a) obeying the principle enunciated here, given that the denotations of the fragment identifier in the two cases stand in a predictable and plausible relation to each other? or (b) violating this principle, given that in the two cases the fragment identifier identifies objects of radically different classes (XML elements on the one hand, people and other non-XML entities on the other)?
The TAG resolved to make the following changes to the document:
Propose three examples for section 3.3.2.
Revise the text of section 3.3.2 per the resolution. The editor notes that the future of the GPN in that section is uncertain.
Section 4.2.2, Story. The text says that defining a new optional "lang" attribute on a "film" element does not affect the conformance of any existing data or software. This isn't quite true: it changes some invalid data (data with the undefined attribute "lang") into valid data, and some non-conforming software (software which erroneously accepts that invalid content) into conforming software.
The text is correct if, but only if, the universe of discourse is restricted to valid data.
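The reviewer's observation can be sketched with a toy validity check in Python. The attribute sets and sample instances are hypothetical, and the check models only which attribute names are allowed on a "film" element.

```python
# Hypothetical attribute vocabularies for the "film" element.
V1_FILM_ATTRS = {"title", "year"}
V2_FILM_ATTRS = V1_FILM_ATTRS | {"lang"}  # v2 adds the optional "lang"


def is_valid_film(attrs, allowed):
    """A film element is valid if it uses only allowed attribute names."""
    return set(attrs) <= allowed


# Valid under v1, and still valid under v2.
old_instance = {"title": "Metropolis", "year": "1927"}

# Invalid under v1 (undefined "lang"), but valid under v2.
stray_instance = {"title": "Metropolis", "lang": "de"}

old_unchanged = is_valid_film(old_instance, V1_FILM_ATTRS) and is_valid_film(
    old_instance, V2_FILM_ATTRS
)
stray_before = is_valid_film(stray_instance, V1_FILM_ATTRS)
stray_after = is_valid_film(stray_instance, V2_FILM_ATTRS)
```

Data that was valid stays valid, which is the claim in the story; the second instance shows the case the reviewer points out, where the conformance status of previously invalid data changes.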
The 2nd sentence of the 2nd paragraph in section 2.5 says "For robustness, Web architecture promotes independence between an identifier and the identified resource.". Should it not say "... the identified resource and its representations."?
I concur with the XML Schema WG's comment that the document is too focused on browser-based interactions rather than on the more general problem of automata interaction. I understand the TAG's reluctance to tackle the Web-vs-Web-services issue, but I think it's important for AWWW to at least give the impression - if not outright say - that there exist solutions to the automata integration problem within the constraints/guidelines/principles of Web architecture. Some other examples in section 3 would help there.
The reviewer's first, fourth, and fifth points are editorial.
In section 4.5.2, I'm uncomfortable with the recommendation to use XLink when using XML, except perhaps when authoring documents which are intended for human consumption. I believe that RDF/XML provides better linking capabilities for XML than XLink does, and IMO preference should be given to it. Alternatively, listing both as options would be adequate.
The reviewer made a number of editorial comments on section 1.2.1. See email for details.
The reviewer made a number of editorial comments on section 1.2.2. See email for details.
The reviewer asked a number of questions about the principle of error recovery. See email for details.
See issue clark5.
The reviewer made a number of editorial comments on section 1.2.4. See email for details.
Parties who wish to communicate must agree upon a shared set of identifiers and on their meanings.
This is false. A baby communicates distress and discomfort to his or her parents without there being any identifiers, or even any identification going on on the part of the baby. I might be able to communicate that this large boulder crushing my leg should be removed by the stout and helpful non-English-speaking lass beside me by making somewhat spastic gesticulations. Or, in a more structured way, I might point at the boulder, or wap the boulder, and make a little rolling motion with my hands.
A number of editorial comments followed.
The identification mechanism for the Web is the URI.
Presumably this isn't *quite* right, as there is a need for some identification mechanisms that are not URI based in order to associate (some, at least) URIs with resources for subsequent reidentification. Also, for example, host names identify things very critical to the functioning of the web, and yet, aren't URIs. Etc.
A URI must be assigned to a resource in order for agents to be able to refer to the resource.
Even restricted to software agents, this is false. The statement
_x foaf:mbox <mailto:bparsia@isr.umd.edu>.
allows an OWL reasoner to refer to me (since foaf:mbox is an InverseFunctionalProperty). (While there was a URI involved, it wasn't assigned *to me*.) I can make or refute assertions about me in this way.
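The reviewer's point can be illustrated with a minimal sketch. It assumes descriptions are plain dictionaries rather than real RDF, and that `foaf:mbox` is treated as inverse functional, so two descriptions with the same mailbox describe the same individual. The function name `merge_by_mbox` and the `ex:` properties are hypothetical and chosen only for illustration.

```python
def merge_by_mbox(descriptions):
    """Merge descriptions of anonymous nodes that share a foaf:mbox value."""
    merged = {}
    for description in descriptions:
        # The mailbox acts as the key; the described node itself has no URI.
        key = description["foaf:mbox"]
        merged.setdefault(key, {}).update(description)
    return merged


# Two independent descriptions of an unnamed node (hypothetical properties).
statements = [
    {"foaf:mbox": "mailto:bparsia@isr.umd.edu", "ex:first": "a"},
    {"foaf:mbox": "mailto:bparsia@isr.umd.edu", "ex:second": "b"},
]
people = merge_by_mbox(statements)
```

Both descriptions end up attached to one individual even though no URI was ever assigned to that individual.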
Overtaken by events.
Replace constraint at beginning of section 2 with a new principle and constraint:
Resources exist before URIs;
If URIs are strings, and strings are abstract mathematical entities (i.e., a kind of data structure) independent of their physical instantiation, then, reasonably, URIs have always existed, so any particular URI has existed before some Resources that have only recently come into existence. I'm not even sure of the point of such metaphysical statements. Or imagine I have, oh, a programming language where I have URI objects (a subclass of String). Let's say I want to use a URI to identify some other objects in my system. Does this claim require that (in pseudopython):
my_object_uri = URI('http://blahblah.com/blah') #The URI now exists!
my_funky_object = FunkyObject() #Now the Resource in question exists.
my_object_uri.assigned_to(my_funky_object)
is broken in some way? Why would this matter?
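The pseudo-Python above can be made runnable as a minimal sketch. The names `URI`, `FunkyObject`, and `assigned_to` come from the reviewer's fragment; their bodies are assumptions added only so that the fragment executes.

```python
class FunkyObject:
    """Stand-in for an arbitrary resource (the body is an assumption)."""


class URI(str):
    """A URI modeled as a str subclass, as the fragment suggests."""

    def assigned_to(self, resource):
        # Record which object this URI identifies (hypothetical behavior).
        self.resource = resource
        return self


my_object_uri = URI('http://blahblah.com/blah')  # the URI object exists first
my_funky_object = FunkyObject()                  # the resource is created afterwards
my_object_uri.assigned_to(my_funky_object)       # assignment works in this order too
```

Nothing in the sketch depends on the resource existing before the URI, which is the reviewer's objection.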
a resource may be identified by zero URIs.
Ah, this is what you mean? It's not very happy either. I take it you mean that some resource might *not* be identified by *any* URI. Cool. And given my above example, it might still be possible for agents to refer to it. Naturally, it's often a good idea to give various resources a URI! For example, I don't think it's possible (or, at least, easy) to *link* to something in a machine readable way in HTML. So, give such resources URIs, please. I think it's quite possible to make the sensible point without appeal to broken metaphysics.
Actually, the rest of the paragraph seems quite good and sensible.
Principle: URI assignment. A resource owner SHOULD assign a URI to each resource that others will expect to refer to.
I would recommend the TAG study FOAF because that community has made a different choice (i.e., to rely a lot on inverseFunctionalProperties).
Aside from that, I think this principle misses an important point: Formats and protocols should (often?) be designed to use URIs. This encourages URI assignment by adding value to such assignment.
I don't see how to usefully split the discussion of this section, as the first line of 2.2 refers to 2.3.
The requirement for URIs to be unambiguous demands that different agents do not assign the same URI to different resources.
But it's ok that the same agent does so? Sorry if that sounds snarky, but it is a genuine question. I would have thought that the primary case was the resource owner...thus far, in the document, no one else seems to have the power to *assign* the same URI to different resources. E.g., principle: URI Assignment, the first line of 2.1, Good Practice: URI aliases.
(Hmm. In 2.1, right after Good Practice: URI Aliases, we get the notion of URI Producers. Who are they? Frankly, I don't find the arbitrary shifting of terms of art in a technical document to be helpful. None of these terms are defined in the glossary, either. Perhaps this isn't a technical document? If so, I'm going to feel a bit sheepish reading it this closely. Section 1.1.1 suggests that it is meant to be a guide for technical people like me (I do all of 1-4). Hmm. Section 1.1.2 says: """This document strikes a balance between brevity and precision while including illustrative examples.""" Ok, that gives me some standard. I think the above claim doesn't strike the right balance between brevity and precision.)
There's another reading, to wit, that the *URI* can be unambiguous (say, because ambiguity of a URI is defined to be having been assigned to more than one resource, and assignment can literally only be done by the URI owner, and the URI owner *can* successfully only assign it to one resource (none of these are obvious!)) while various agents can *use* it ambiguously. On this reading, the requirement of URI ambiguity does *not* demand that different agents do not *use* the same URI to refer to different resources. (I don't know if you mean "assign" up there in the restricted sense I was using in the definition of ambiguity, or just to mean "refer", or something else.)
Section 2.3 says that the ambiguous *use* of URIs is to be avoided (though, I'll point out, that the Good Practice is ambiguous between ambiguous URIs and ambiguous *use* of URIs).
Of course, certain ambiguity doesn't matter, e.g., replicating Quine, I might use a URI to refer to me, the human being, and someone else to refer to the collection of undetached people parts. As long as all our uses *align* in (all) our interactions, we're fine, ambiguous assignment or not.
Sorry for the quick digression into philosophy of languages, but, really, at this time of night, I feel a little justified in turn around :)
Hierarchical delegation of authority. This approach, exemplified by the "http" and "mailto" schemes, allows the assignment of a part of URI space to one party, reassignment of a piece of that space to another, and so forth.
First use of 'URI space', which is undefined. I see 'information space', 'uniform address space', and, of course, 'namespace'. As far as I can tell, only 'namespace' has a definition (and it's not in this doc, which is fine). Perhaps this is only editorial. A URI space seems clear (a set of URIs? why not say that then?), but I did spend some time wondering if it was the same as an information space or address space. *Are* you using unambiguous phrases here? Are they aliases? Is there a problem with either defining terms or using only one where there's only one concept? Some principles of the web apply well to technical prose.
Whatever the techniques used, except for the checksum case, the agent has a unique relationship with the URI, called URI ownership.
Here is what I can find on what's an "agent", prior to this passage:
Within each of these systems, agents (people and software)
[...] illustrate typical behavior of Web agents -- people or software (on behalf of a person, entity, or process) acting on this information space. Software agents include servers, proxies, spiders, browsers, and multimedia players.
So, an agent is a person or a program. Thus, every http URI has, supposedly, one, and only one, person or program that is its owner. However, institutional ownership seems possible, as is joint ownership.
The social implications of URI ownership are not discussed here. However, the success or failure of these different approaches depends on the extent to which there is consensus in the Internet community on abiding by the defining specifications.
First you say that the social implications of URI ownership are *not* discussed here, then go on to discuss some social implications. Don't do that.
I don't believe the second statement of that quote, at least on many interpretations, and I've objected to its use in various technical arguments, some with TAG members. If this passage is to be a stick to beat me with in technical debate in W3C working groups, then I strenuously object to it, especially without substantial explication and clarification. So, I make the strong comment that I want this line struck. I object to it.
Is anything in this document normative? I notice that there is some rejection of adding a conformance section, which is fine, but I have *NO* idea how to use this document in working groups, nor do I know how it may be used by others. I totally fail to see how this can be helpful. So, I would like some guidance about that.
It is tempting to guess the nature of a resource by inspection of a URI that identifies it. However, the Web is designed so that agents communicate resource state through representations, not identifiers. In general, one cannot determine the Internet Media Type of representations of a resource by inspecting a URI for that resource. For example, the ".html" at the end of "http://example.com/page.html" provides no guarantee that representations of the identified resource will be served with the Internet Media Type "text/html". The HTTP protocol does not constrain the Internet Media Type based on the path component of the URI; the server is free to return a representation in PNG or any other data format for that URI.
First sentence talks about inferring the *nature* of a *resource* by URI inspection (i.e., inferring that <http://ex.org/#BijanThePerson> rdf:type Person. from the URI alone). But the third sentence through the rest of the paragraph talks about inferring the Mimetype of the *representation* of the (state of) the resource. If you mean to discourage both practices, some serious reworking is in order.
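The second of the two practices distinguished above, guessing a representation's media type from the URI instead of reading the declared metadata, can be sketched as follows. This is only an illustration: the function names and the simulated response headers are assumptions, and no network request is made.

```python
import mimetypes
from email.message import Message


def guess_from_uri(uri):
    """Guess a media type from the URI suffix alone (the discouraged practice)."""
    return mimetypes.guess_type(uri)[0]


def media_type_from_headers(headers):
    """Read the media type the server declared for the representation."""
    msg = Message()
    msg["Content-Type"] = headers["Content-Type"]
    return msg.get_content_type()


uri = "http://example.com/page.html"
# Hypothetical response: the server is free to return PNG for this URI.
response_headers = {"Content-Type": "image/png"}

guessed = guess_from_uri(uri)
declared = media_type_from_headers(response_headers)
```

Here the guess and the declared type disagree, and only the declared type describes the representation actually received.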
Resource state may evolve over time. Requiring resource owners to change URIs to reflect resource state would lead to a significant number of broken links. For robustness, Web architecture promotes independence between an identifier and the identified resource.
I just wonder how this is different from:
Resources may come and go over time. Requiring resource owners to abandon URIs to reflect resource non-existence would lead to a significant number of broken links. For robustness, Web architecture promotes independence between an identifier and the identified resource.
Of course, you might say that abandoning URIs isn't what's required, but rather maintaining legacy state. But then you've either changed the resource (to something "representing" the nonexistent resource), or you return representations reflecting the state of a nonexistent resource. Of which there isn't any.
(Note that I'm not talking about imaginary entities, but ones who have ceased to exist.)
The logic of avoiding broken links suggests that temporal URL ambiguity might be useful for Web robustness (which might not be the same as correctness).
Good practice: URI opacity. Agents making use of URIs MUST NOT attempt to infer properties of the referenced resource except as licensed by relevant specifications.
This says nothing about not inferring properties of the retrieved representations.
Note: The Web Architecture does not require a formal definition of the commonly used phrase "on the Web." Informally, a resource is "on the Web" when it has a URI and an agent can use the URI to retrieve a representation of it using network protocols (given appropriate access privileges, network connectivity, etc.).
Given that Web Arch doesn't require it, I would recommend not including even an informal definition. Especially as it seems wrong, e.g., such that tel:+1-816-555-1212 and any URN identified resources aren't on the web (though they are possible subjects and objects (and I guess predicates; URNs certainly) of RDF assertions). Is there a relation between being "on the Web" and being, er, part of the "information space" that is the web?
So, I think this note is strike worthy.
Overtaken by events.
Delete note about "on the Web".
Deleted note about "on the Web" from the beginning of section 3.
Incorporated in 10 May 2004 draft.
Successful communication between two parties using a piece of information relies on shared understanding of the meaning of the information.
I'll spare you the critical analysis of the opening platitude of a section of a document. It's not clear to me, however, that they are, in fact, useful.
Arbitrary numbers of independent parties can identify and communicate about a Web resource. To give these parties the confidence that they are all talking about the same thing when they refer to "the resource identified by the following URI ..." the design choice for the Web is, in general, that the owner of a resource assigns the authoritative interpretation of representations of the resource.
So, this is "in general", which suggests that "in specific" this might not be the case. For example, when the owner of the resource, uh, *gets it wrong*. One example is ""Inconsistencies between Metadata and Representation Data"".
So, let's generalize. What if the owner of the resource gets the *information* encoded in the message wrong? Is that authoritative? What would that mean? Suppose I retrieve a representation of my purchase order, does the resource owner have an authoritative interpretation of the *meaning of the order*, interpreting my "5 very cheap things, please" as "5000 hugely expensive things, you bastard!!!"?
There is a sensible thing buried in here, I think. I think it's quite right to be judicious in ignoring narrow, well understood and somewhat verifiable representation metadata. One example (if there were a media type for OWL-DL and OWL-Full as well as RDF) would be interpreting a retrieved ontology as OWL-DL vs. just as RDF. Different inferences are licenced, and there are times where one might want to publish the ontology for RDF interpretation only.
Of course, really, it would be best if the format provided a way to specify this.
The second bullet doesn't make much sense; it seems to argue that META tags are capable of specifying new HTTP headers on the fly. While I suppose someone could make up a new header and stick it in a META tag, this really isn't the point. It also surmises that this somehow leads to the feature not being widely deployed; I believe that the reason is much more down-to-earth: performance.
[T]he TAG might be interested to know that an IANA registry of HTTP headers should be established soon; depending on the schedule of the AWWW, it may be possible to directly reference it in this section.
It is common for programmers working with the Web to write code that generates and parses these messages directly. It is less common, but not unusual, for end users to have direct exposure to these messages. This leads to the well-known "view source" effect, whereby users gain expertise in the workings of the systems by direct exposure to the underlying protocols.
It was not clear to me what is the intended significance of this with respect to Web Architecture. Suggest: explain the significance or drop this paragraph.
A URI must be assigned to a resource in order for agents to be able to refer to the resource. It follows that a resource should be assigned a URI if a third party might reasonably want to link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it.
"or perform other operations on it" suggests a resource should be a very concrete thing. Suggest "or refer to it in some other way".
When a representation uses a URI (instead of a local identifier) as an identifier, then it gains great power from the vastness of the choice of resources to which it can refer. The phrase the "network effect" describes the fact that the usefulness of the technology is dependent on the size of the deployed Web.
The comment about "network effect" in the first para seems somewhat disjoint. What does it tell us about Web architecture? Suggest: "This vastness of choice gives rise to a "network effect", which refers to a technology's usefulness increasing more rapidly than the size of the network across which it is deployed"
A URI must be assigned to a resource in order for agents to be able to refer to the resource. It follows that a resource should be assigned a URI if a third party might reasonably want to link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it.
[...]
Resources exist before URIs; a resource may be identified by zero URIs. However, there are many benefits to assigning a URI to a resource, including linking, bookmarking, caching, and indexing by search engines. Designers should expect that it will prove useful to be able to share a URI across applications, even if that utility is not initially evident.
There seems to be some overlap between these paragraphs. And I found the first sentence of the second paragraph to be potentially confusing.
Suggest: a re-arrangement:
A URI must be assigned to a resource in order for agents to be able to refer to the resource. It follows that a resource should be assigned a URI if a third party might reasonably want to link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or refer to it in some other way. A resource may exist independently of whether or not it has a URI; one or more URIs may be used to identify a given resource.
[...as before...]
There are many benefits to assigning a URI to a resource, as noted above. Designers should expect that it will prove useful to be able to share a URI across applications, even if that utility is not initially evident.
The scope of a URI is global; the resource identified by a URI does not depend on the context in which the URI appears (see also the section about URIs in other roles). Of course, what an agent does with a URI may vary. The TAG finding "URIs, Addressability, and the use of HTTP GET and POST" discusses additional benefits and considerations of URI addressability.
The term "global" here is not defined or qualified. Suggest "global across the Web".
...For example, the parties responsible for weather.example.com should not use both "http://weather.example.com/Oaxaca" and "http://weather.example.com/oaxaca" to refer to the same resource; agents will not detect the equivalence relationship by following specifications.
and
... Agents should not assume, for example, that "http://weather.example.com/Oaxaca" and "http://weather.example.com/oaxaca" identify the same resource, since none of the specifications involved states that the path part of an "http" URI is case-insensitive.
While correct, I felt this was potentially a little confusing. The first example did not seem well chosen to reflect the point I think is being made. Suggest:
...For example, the parties responsible for weather.example.com should not use both "http://weather.example.com/Oaxaca" and "http://weather.example.com/Mexico?city=Oaxaca" to refer to the same resource; agents will not detect the equivalence relationship by following specifications.
Hmmm, maybe there's a third point to be made here, namely that the party responsible for some domain should avoid using different URIs with small, easily overlooked differences?
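The equivalence question raised by these examples can be sketched as below. The sketch assumes only what the generic URI syntax provides: scheme and host compare case-insensitively, while path and query are compared exactly. The function name `same_by_spec` is hypothetical, and scheme-specific rules (default ports, percent-encoding) are deliberately left out.

```python
from urllib.parse import urlsplit


def same_by_spec(a, b):
    """Compare two URIs, folding case only for the scheme and the host."""
    def key(uri):
        parts = urlsplit(uri)
        # urlsplit lowercases the scheme; .hostname is lowercased as well.
        return (parts.scheme, parts.hostname, parts.port,
                parts.path, parts.query, parts.fragment)
    return key(a) == key(b)


upper = "http://weather.example.com/Oaxaca"
lower = "http://weather.example.com/oaxaca"
by_query = "http://weather.example.com/Mexico?city=Oaxaca"
```

An agent following this rule treats none of the three URIs as equivalent, so an owner who uses more than one of them for the same resource creates aliases that agents cannot detect.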
Hierarchical delegation of authority. This approach, exemplified by the "http" and "mailto" schemes, allows the assignment of a part of URI space to one party, reassignment of a piece of that space to another, and so forth.
While technically correct, I don't think 'mailto' is a useful example of hierarchical delegation of naming authority within a URI structure. I'd suggest 'ftp:' or 'urn:' or 'file:' or 'ldap:'
URI ambiguity should not be confused with ambiguity in natural language.
I'm not sure what this sentence is trying to say (what is meant here by "confused with"). From what follows, I think the intent is to say something like "justified by", in which case I think something like:
URIs should not be permitted the ambiguity that occurs in natural language.
[...existing text...]
This flexibility is not available to URIs, which should be defined to refer to a single concept.
[Later,] I ran across this from TimBL in one of the Tag IRC logs, which seems to capture the point more effectively.
Suggested text for 2.6: Whereas human communication tolerates such ambiguity, machine processing does not. Strictly, the above URI identifies the information resource, some hypertext document. RDF applications which use it for describing properties of that page are in order; those who use its URL to directly assert properties of the whale are using it inconsistently.
The use of unregistered URI schemes is discouraged for a number of reasons:
This doesn't seem to be strong enough. Suggest:
The use of unregistered URI schemes is not a permitted part of the Web architecture, for a number of reasons:
Resource state may evolve over time. Requiring resource owners to change URIs to reflect resource state would lead to a significant number of broken links. For robustness, Web architecture promotes independence between an identifier and the identified resource.
I think a link to orthogonality (section 1.2.1) may be appropriate about here.
Although many URI schemes are named after protocols, this does not imply that use of such a URI will result in access to the resource via the named protocol. Even when an agent uses a URI to retrieve a representation, that access might be through gateways, proxies, caches, and name resolution services that are independent of the protocol associated with the scheme name.
As phrased, I find this to be at odds with the text that follows, cf. numbered items 4/5/6. Suggest replace:
use of such a URI will result
with
use of such a URI will necessarily result
The TAG agreed with the proposal to add "necessarily."
Add "necessarily"
Added "necessarily" to third paragraph of section 3.1.
Incorporated in 10 May 2004 draft.
On the other hand, it is considered an error if the semantics of the fragment identifiers used in two representations of a secondary resource are inconsistent.
This seems a rather odd statement to make (specifically: "it is considered an error ..."), because there is no specific way to determine if the would-be erroneous condition actually arises. Suggest: drop this paragraph; the intent is clear enough from the following good practice point.
The TAG agrees with the reviewer's point, but has decided to keep the text and clarify it. At their 13 May 2004 ftf meeting, the TAG resolved:
Overtaken by events.
Improve text regarding responsibility for inconsistent frag id semantics (looking at new RFC2396 text).
Add text from RFC2396.
In section 3.3.1 added text from RFC2396. Also deleted: "On the other hand, it is considered an error if the semantics of the fragment identifiers used in two representations of a secondary resource are inconsistent."
Incorporate TAG resolution.
Rewrite story at beginning of 3.3.1. See resolution to delete "Note..." to end of the paragraph.
For a given resource, an agent may have the choice between representation data in more than one data format (through HTTP content negotiation, for example). Since different data formats may define different fragment identifier semantics, it is important to note that by design, the secondary resource identified by a URI with a fragment identifier is expected to be the same across all representations. Thus, if a fragment has defined semantics in any one representation, the fragment is identified for all of them, even though a particular data format may not be able to represent it.
The term "by design" seems rather odd here. It seems to me that the (technical) design specifically does not achieve "the secondary resource identified by a URI with a fragment identifier is ... the same across all representations".
I think the clause "by design" could be dropped without loss (or, maybe, replaced with something like "by intent").
Successful communication between two parties using a piece of information relies on shared understanding of the meaning of the information. Arbitrary numbers of independent parties can identify and communicate about a Web resource. To give these parties the confidence that they are all talking about the same thing when they refer to "the resource identified by the following URI ..." the design choice for the Web is, in general, that the owner of a resource assigns the authoritative interpretation of representations of the resource.
I recall that TimBL and Pat Hayes had a lengthy debate about something rather like this. See Thread with some indication of consensus around this mail and this email. I am not sure that the above text really captures the subtlety of this discussion. As Pat Hayes noted:
> Note though that other non-RDF systems may and do use URIs. So the principle can must be a general one of web architecture. Names are global in scope. OK, though (in the other branch of the discussion) I don't think this is going to be feasible, myself, if taken strictly. Still, I agree, it's not a bad place to start, as long as we understand that we will eventually have to replace it with something more sophisticated.
Since Nadia finds the Oaxaca weather site useful, she emails a review to her friend Dirk recommending that he check out 'http://weather.example.com/oaxaca'. Dirk clicks on the link in the email he receives and is surprised to see his browser display a page about auto insurance. Dirk confirms the URI with Nadia, and they both conclude that the resource is unreliable. Although the managers of Oaxaca have chosen the Web as a communication medium, they have lost two customers due to ineffective resource management.
I think that "the managers of Oaxaca" should be "the managers of http://weather.example.com/".
There are strong social expectations that once a URI identifies a particular resource, it should continue indefinitely to refer to that resource; this is called URI persistence. URI persistence is a matter of policy and commitment on the part of authorities servicing URIs. The choice of a particular URI scheme provides no guarantee that those URIs will be persistent or that they will not be persistent.
The terminology "authorities servicing URIs" seems not to be consistent with that used elsewhere; e.g. "authority responsible for a resource" at the start of section 3.6.1, and "URI producers" in section 2.1.
As I draft this, I think there's maybe a deeper omission here: a lack of separation between the owner or authority responsible for a resource, and the authority for a particular part of URI space that may be used to identify a resource. (cf. also my previous comment above.) If not clarified, I think this could be a source of continuing miscommunication.
The TAG believes that the following minor changes to the document are sufficient to address the reviewer's concern.
Incorporate resolution into section 3.6.2
Inconsistent representations served. Note the difference between a resource owner changing representations predictably in light of the nature of the resource (the changing weather of Oaxaca) and the owner changing representations arbitrarily.
The term "predictably" here seems an odd choice given the nature of the illustrative example (thinks ... butterflies flapping in Beijing, etc.). Suggest: rationally.
Improper use of content negotiation, such as serving two images as equivalent through HTTP content negotiation, where one image represents a square and the other a circle.
This doesn't seem like a particularly helpful example, because in some contexts a circle and square may be genuinely different representations of a common underlying concept (e.g. alternative GraphViz presentations of an RDF graph). Suggest: "...such as serving two images as equivalent through HTTP content negotiation, where one image represents a weather map of Oaxaca and the other a street map of Chihuahua"
I made a note to myself at the end of this section: "Maybe add a comment about metadata consistency and problems that may occur if a resource is not persistent" but now I am not sure what it is I meant by this.
I think I may have been thinking about a case where RDF is used to describe some resource, but the resource whose representation is served at a given URI is allowed to change over time. Then, any RDF that uses said URI to describe the resource at some point in time becomes completely incorrect if the URI is assigned to a different resource. Is it worth trying to make a point that the value of RDF descriptions depends to a considerable extent on the stability/persistence of the URIs used?
I notice that in this section, the terminology used slips from "data format" or just "format" to "language", without any explanation that they mean pretty much the same thing in this context (or, if they don't, without any explanation of the difference).
RDF allows well-defined mixing of vocabularies, and allows text and XML to be used as a data type values within a statement having clearly defined semantics.
I couldn't figure precisely what this was trying to say.
Incorporate resolution.
Third bullet changed to: "The semantics of combining RDF documents containing multiple vocabularies is well-defined."
Note however, that for general XML there is no semantic model that defines the interactions within XML documents with elements and/or attributes from a variety of namespaces. Each application must define how namespaces interact and what effect the namespace of an element has on the element's ancestors, siblings, and descendants.
I think that there may be an important point to be made here about the relationship of the "Semantic Web" with what I might call the "Hypertext Web" upon which it is built, that the "Semantic Web" provides a well-defined way to combine statements that draw upon an arbitrary number of different namespaces. (I regard this as one of the more important contributions of the Semantic Web.)
Maybe this is what the subject of my previous comment was trying to say?
The TAG resolved (SW abstaining):
Overtaken by events.
Create new section 4.6
New section created: "Data Formats Used to Build New Information Space Applications"
Note that when content, presentation, and interaction are separated by design, agents need to recombine them. There is a recombination spectrum, with "client does all" at one end and "server does all" at the other. There are advantages to each: recombination on the server allows the server to send out generally smaller amounts of data that can be tailored to specific devices (such as mobile phones). However, such data will not be readily reusable by other clients and may not allow client-side agents to perform useful tasks unanticipated by the author. When a client does the work of recombination, content is likely to be more reusable by a broader audience and more robust. However, such data may be of greater size and may require more computation by the client.
I think there are also some scalability concerns that might be mentioned here; e.g. an application is, in general, more likely to operate at Internet scale if as much processing as possible is performed by user agents (often, clients) rather than centralized processing agents (often, servers).
Decided at Ottawa f2f.
Draft text to explain that there's a tradeoff in this situation.
Language designers SHOULD incorporate hypertext links into a data format if hypertext is the expected user interface paradigm.
I found this statement a bit puzzling: many data formats have nothing to do with a user interface; the preceding text says "What agents do with a hypertext link is not constrained by Web architecture and may depend on application context". So what is this trying to say?
I found the text of this section less clear than was offered in an email from TimBL:
It is important to distinguish between the string which identifies something and the BNF for a string in a document which is used to specify the first string. The first is an identifier. The second has been called a "reference". A reference can use a relative form.
...While it is directed at Internet applications with specific reference to protocols, the discussion is generally applicable to Web scenarios as well.
I am uneasy with this phrasing, as it seems to suggest the Web is somehow apart from the Internet. Suggest:
While it is directed at Internet applications with specific reference to protocols, the discussion is also applicable to Web application formats.
Another reference with discussion relating to this topic of choosing to use XML can be found in RFC3117, section 5.1
These Internet Media Types create two problems: First, for data identified as "text/*", Web intermediaries are allowed to "transcode", i.e., convert one character encoding to another. Transcoding may make the self-description false or may cause the document to be not well-formed.
The statement "Web intermediaries are allowed to 'transcode' ..." seemed to me to be rather broadly applied here. Is there a specification that asserts this in general? If not, I think the comment should be constrained to something like "in some Web applications, intermediaries are allowed to transcode."
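To make the concern in the quoted text concrete, here is a minimal Python sketch of how transcoding can make an XML document's self-description false. The sample document and the `transcode` and `city_name` helpers are hypothetical, and the intermediary is assumed to re-encode the bytes without rewriting the XML declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document that declares its own encoding (ISO-8859-1).
ORIGINAL = (
    '<?xml version="1.0" encoding="iso-8859-1"?><city>Ju\u00e1rez</city>'
).encode("iso-8859-1")


def transcode(body: bytes, source: str, target: str) -> bytes:
    """Re-encode the bytes as a naive intermediary might,
    leaving the XML declaration untouched."""
    return body.decode(source).encode(target)


def city_name(body: bytes) -> str:
    """Parse the bytes, trusting the encoding named in the XML declaration."""
    return ET.fromstring(body).text


transcoded = transcode(ORIGINAL, "iso-8859-1", "utf-8")
before = city_name(ORIGINAL)    # declaration matches the bytes
after = city_name(transcoded)   # declaration no longer matches the bytes
```

After transcoding, the declaration still says ISO-8859-1 while the bytes are UTF-8, so the parser reads the city name incorrectly. Whether intermediaries are permitted to do this in general is exactly the question the reviewer raises.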
Second, representations whose Internet Media Types begin with "text/" are required, unless the charset parameter is specified, to be considered to be encoded in US-ASCII. Since the syntax of XML is designed to make documents self-describing, it is good practice to omit the charset parameter, and since XML is very often not encoded in US-ASCII, the use of "text/" Internet Media Types effectively precludes this good practice.
I found this confusing, in that I wasn't clear what it was that was being said, and I couldn't see how it relates to the good practice point that immediately follows it.
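One possible reading of the quoted passage can be sketched in Python as follows. The function name `effective_charset` and the sample body are hypothetical; the US-ASCII default for "text/*" without a charset parameter is taken from the quoted text, not from any particular implementation.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Return the charset a recipient would assume from the header alone."""
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()
    if declared:
        return declared
    if msg.get_content_maintype() == "text":
        # Per the quoted text: "text/*" without charset means US-ASCII.
        return "us-ascii"
    # Otherwise the header says nothing; the format must describe itself.
    return None


# A self-describing XML body that is not US-ASCII.
BODY = (
    '<?xml version="1.0" encoding="utf-8"?><city>Ju\u00e1rez</city>'
).encode("utf-8")


def decodes_under_header(content_type: str) -> bool:
    """Check whether BODY can be decoded with the header-derived charset."""
    charset = effective_charset(content_type)
    if charset is None:
        return True  # recipient falls back to the XML declaration
    try:
        BODY.decode(charset)
        return True
    except UnicodeDecodeError:
        return False
```

Under this reading, omitting the charset parameter is harmless for "application/xml" because the XML declaration takes over, but for "text/xml" the omitted parameter forces US-ASCII and the self-description is ignored.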
[[A travel scenario is used throughout this document to illustrate typical behavior of Web agents -- people or software (on behalf of a person, entity, or process) acting on this information space.]]
This sentence defines "Web agents" as including both people and software (as opposed to just software). However, the usage of terms like "agent", "user agent", etc. throughout the document isn't always consistent with including people in addition to software in the definition (and sometimes, but not always, the phrase "software agent" is used explicitly in places where only software is clearly meant). Further comments will illustrate this point in specific instances.
[[This scenario illustrates the three architectural bases of the Web that are discussed in this document: 1. Identification. Each resource is identified by a URI. In this travel scenario, the resource is about the weather in Oaxaca and the URI is "http://weather.example.com/oaxaca".]]
It would be more consistent with the rest of the example if "the resource is about the weather in Oaxaca" read "the resource is *a report* about the weather in Oaxaca."
Further, given the potential generality of the things URIs can identify, it would be helpful if there were some discussion somewhere about distinguishing between URIs identifying such distinct things as:
The reviewer suggested a number of small editorial fixes; the editor will review them.
[[The section on Architectural Specifications includes references.]]
This sentence seems to end abruptly. References to what?
[[Authors of protocol specifications in particular should invest time in understanding the REST model and consider the role to which of its principles could guide their design: statelessness, clear assignment of roles to parties, uniform address space, and a limited, uniform set of verbs.]]
This sentence has an "interesting" structure. For one thing, "the role to which of its principles could guide their design" seems to mix several more usual constructions, e.g., either "the role its principles could [or should] *play* in their designs" or "the *extent* to which each of its principles could [or should] guide their designs". For another, it seems as if the list of principles should follow "principles" rather than "design", as in something like: "Authors of protocol specifications in particular should invest time in understanding the REST model and consider the role its principles -- statelessness, clear assignment of roles to parties, uniform address space, and a limited, uniform set of verbs -- could play in their designs."
[[A user agent acts on behalf of the user and therefore is expected to help the user understand the nature of errors, and possibly overcome them. User agents that correct errors without the consent of the user are not acting on the user's behalf.]]
Is "user agent" intended to be *any* kind of "agent" (human or software, as previously defined) acting on behalf of someone else (the "user", so far undefined), or just *software* that acts on behalf of a human?
Also, the text seems to equate "act on behalf of the user" with that action necessarily being helpful, which is not necessarily the way "act on behalf of" is always interpreted. The real point would seem to be that user agents that correct errors in this way may in some sense be acting on the user's behalf, but they aren't helping the user by doing it.
Third bullet: [[* An agent that encounters unrecognized content...]]
Given the context, this seems a bit ambiguous, since it might be taken to refer to "user agent", as well as more generally to "agent" (assuming these are different; are they?)
First para: [[Parties who wish to communicate must agree upon a shared set of identifiers and on their meanings.]]
"identifiers (names for things)"?
Second para: [[It follows that a resource should be assigned a URI if a third party might reasonably want to link to it...]]
Why "a third party" (what are the first two parties)?
[[Designers should expect that it will prove useful to be able to share a URI across applications, even if that utility is not initially evident.]]
Why is the reference here to "designers"? Designers of what? This sounds like it should instead be "resource owners", as mentioned in the principle "URI assignment" just below.
However, there is also a related issue, which is that the terms "resource owner", "URI owner", and "URI producer" are all used in further discussion. I can imagine situations in which these (or at least the first two) might have distinct meanings, but they don't necessarily seem to be used that way. If they are distinct, it would help to have precise definitions. If they aren't distinct, it would help to pick one term and use it consistently (along with some further explanation of why they are the same).
In particular, "Resource owner" and "URI owner" would appear to be equivalent when discussing URIs that refer to resources with retrievable representations. However, given that URIs can be created to refer to other kinds of resources, it would seem that multiple URIs might be created to refer to the same resource, and those URIs would have different owners. For example, suppose I (the person Frank Manola) am the resource. It seems to me I can reasonably claim to be the owner of that resource (whether I have assigned myself a URI or not; recall that resources can have zero URIs). Independently, other people may create resources with retrievable representations (e.g., reports) that refer to me and, perhaps not knowing the URI I have assigned to myself (even if I *have* assigned one), can create URIs to refer to me (say, in RDF statements). It seems to me those other people can reasonably claim to be the owners of those latter URIs (e.g., they determine that those URIs denote me), even though they don't own the resource the URIs identify (me). Moreover, these other people (more effectively) can create URIs for the resources with retrievable representations (reports referring to, among other things, Frank Manola), and those resources (the reports) are distinct from the resource Frank Manola. In this case, it seems to me that those other people are the owners of both the resources (the reports) and the URIs that identify them.
[[Principle: URI assignment: A resource owner SHOULD assign a URI to each resource that others will expect to refer to.]]
This seems as if it should read "A resource owner SHOULD assign a URI to each resource that the owner expects others will want to refer to." (How can others expect to refer to resources they don't necessarily know about?)
[[URI producers should be conservative about the number of different URIs they produce for the same resource. For example, the parties responsible for weather.example.com should not use both "http://weather.example.com/Oaxaca" and "http://weather.example.com/oaxaca" to refer to the same resource; agents will not detect the equivalence relationship by following specifications. On the other hand, there may be good reasons for creating similar-looking URIs. For instance, one might reasonably create URIs that begin with "http://www.example.com/tempo" and "http://www.example.com/tiempo" to provide access to resources by users who speak Italian and Spanish.]]
Why does the first sentence refer to "URI producers" that "produce" URIs rather than "resource owners" that "create" them (which would be more consistent with earlier text)? I also note that the words "assign", "create", and "produce" (and possibly others) are all used for what seems to be the same idea.
Also, the rest of this illustration seems to have a funny interaction with the URI opacity principle in Section 2.5 (especially the discussion there about the travel example), since the Section 2.1 text above seems to suggest there is value in being able to convey information to an accessing "agent" (a human in this case) via the form of the URI itself (i.e., if URIs are to be totally opaque to the "agent", why would there be value in using one language over another?). Of course, this may be just another problem in allowing "agent" to refer to people. However, the problem seems somewhat more acute if the result of dereferencing URIs in different languages is the retrieval of the report in the corresponding languages because, while this kind of makes sense, it also invites determining the language of the report from the language of the URI.
[[The requirement for URIs to be unambiguous demands that different agents do not assign the same URI...]]
Now we have *agents* assigning URIs rather than, e.g., resource or URI owners. It's not clear that this is consistent with prior discussion.
[[The concept of URI ownership is especially visible in the case of the HTTP protocol, which enables the URI owner to serve authoritative representations of a resource.]]
This text is pertinent to the point raised earlier about resource vs. URI ownership, and might be expanded on a bit to clarify that relationship. In particular, when dealing with URIs that have retrievable representations, it is straightforward to demonstrate ownership; non-owners can't determine what is returned when dereferencing such URIs, while owners can.
[[URI ambiguity should not be confused with ambiguity in natural language. The English statement "'http://www.example.com/moby' identifies 'Moby Dick'" is ambiguous because one could understand the phrase "Moby Dick" to refer to distinct resources: a particular printing of this work, or the work itself in an abstract sense, or the fictional white whale, or a particular copy of the book on the shelves of a library (via the Web interface of the library's online catalog), or the record in the library's electronic catalog which contains the metadata about the work, or the Gutenberg project's online version]]
This example illustrates an ambiguous natural language statement, but it's not clear that it doesn't also illustrate an ambiguous URI, since the text doesn't say anything about how example.org, or other parties citing http://www.example.com/moby, actually interpret it.
[[In Web architecture, URIs identify resources. Outside the bounds of Web architecture specifications, URIs can be useful for other purposes, for example, as database keys...]]
It seems to me this paragraph mixes a few things. Just because a URI is used as a database key doesn't necessarily mean it's being used for a different purpose. If a URI is used as a key in a relational table that associates metadata with the Web resources identified by those keys, and does so correctly (i.e., distinguishes between metadata about Nadia and metadata about her mailbox), it seems as if this is the *same* use of the URI (to identify a Web resource), even though it may also be used in the database to identify a distinct row in the table. Moreover, the database might exhibit URI ambiguity in the same way the Web might, e.g., by mixing metadata about both Nadia and her mailbox in the same row. At the same time, the use of "mailto:nadia@example.com" as an identifier for Nadia rather than her mailbox seems just as likely to occur in a Web context as in this database one (people seem to want to do it in RDF, for example; or is this not the part of the Web you're talking about?).
[[It is tempting to guess the nature of a resource by inspection of a URI that identifies it. However, the Web is designed so that agents communicate resource state through representations, not identifiers.]]
This is another place where including people in the definition of "agents" seems to create a possible difficulty. If agents include people, then people quite frequently communicate information about the nature of a resource by inspection of URIs, and it's very helpful. For example, "http://weather.example.com/oaxaca" certainly suggests that the resource it identifies has something to do with the weather in Oaxaca (as is noted further on), and that's very useful information (e.g., when people pass those URIs around). That's certainly information about "the nature of a resource", and Internet Media Types aren't the only things relevant to people. This all, of course, reads much better if "agents" are restricted to software. Pursuing this point in the subsequent text:
[[Agents making use of URIs MUST NOT attempt to infer properties of the referenced resource except as licensed by relevant specifications.]]
This is good practice for software "agents". For people "agents", given the "must not", how do you propose to stop them? Further to this point, the text goes on:
[[The example URI used in the travel scenario ("http://weather.example.com/oaxaca") suggests that the identified resource has something to do with the weather in Oaxaca. A site reporting the weather in Oaxaca could just as easily be identified by the URI "http://vjc.example.com/315". And the URI "http://weather.example.com/vancouver" might identify the resource "my photo album."]]
This is certainly true. But while it's good practice for software to treat URIs opaquely, it seems to me that given the discussion in Section 2.1, which seems to license creating "descriptive" URIs in different languages to enable people speaking those languages to more easily access a resource (and which reflects the use of text in URIs as a means for conveying information to people), you might want to suggest that, given this "dual purpose" of URIs, it's *not* good practice to use the URI "http://weather.example.com/vancouver" to identify the resource "my photo album", even though one could, and it would be irrelevant to software.
Reference [RDF10] cites the RDF M&S Recommendation, rather than the new Recommendation set, and should be updated (the OWL reference should be updated as well). If *one* of the new RDF documents is to be cited, I would suggest Concepts.
[[Note that one can use a URI with a fragment identifier even if one does not have a representation available for interpreting the fragment identifier (one can compare two such URIs, for example). Parties that draw conclusions about the interpretation of a fragment identifier without retrieving a representation do so at their own risk; such interpretations are not authoritative.]]
This is a place where some qualifying context about the nature of the Web to which this architecture applies would have been helpful. For example, suppose I have a collection of RDF or OWL statements having as subjects the URI "http://www.example.com/images/nadia#hat", and the RDF/OWL statements assert that the subject is of class "Hat" in some ontology, that it's blue, and so on. On one hand, it seems as if one could reasonably draw conclusions about the interpretation of this fragment identifier (or rather the whole URI including it) *from the RDF/OWL* without dereferencing the URI (using the URI to retrieve a representation, whose media type specifies the authoritative interpretation), assuming that the RDF/OWL itself is from a sufficiently "authoritative" (in some sense) representation somewhere. Saying "such interpretations are not authoritative" without any further qualification or discussion, while it makes perfect sense given the way the Web works now, doesn't seem to take such additional usage (which, after all, is described in W3C Recommendations) into account.
The TAG intends to clarify the text to indicate that parties who draw conclusions from syntactic analysis of URIs alone do so at their own risk.
Edit "Parties that draw...." to be about drawing conclusions from syntactic analysis.
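The point that a URI with a fragment identifier can be used (for example, compared) without retrieving any representation can be illustrated with a short Python sketch. The helper names are hypothetical, and the comparison shown is plain string comparison, which says nothing authoritative about what the fragment identifies.

```python
from urllib.parse import urldefrag


def split_fragment(uri: str) -> tuple:
    """Separate the URI from its fragment identifier, purely syntactically."""
    base, fragment = urldefrag(uri)
    return base, fragment


def same_uri(a: str, b: str) -> bool:
    """Compare two URIs as strings; no representation is retrieved."""
    return a == b


HAT = "http://www.example.com/images/nadia#hat"
COAT = "http://www.example.com/images/nadia#coat"

base, fragment = split_fragment(HAT)
```

Here `same_uri(HAT, COAT)` is false and both URIs share the same base, but what "hat" denotes within a representation is only settled by the media type of a retrieved representation, as the quoted text says.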
It would be helpful if the text following the "story" explicitly answered the question posed in the "story". For example, if the idea here is that the fragment should always identify Nadia's hat in any graphic representation provided by dereferencing the URI containing the fragment, it would help to say that.
First para: [[To give these parties the confidence that they are all talking about the same thing when they refer to "the resource identified by the following URI ..." the design choice for the Web is, in general, that the owner of a resource assigns the authoritative interpretation of representations of the resource.]]
The text "owner of a resource" links to Section 2.2 titled "URI Ownership". So why say "owner of a resource" rather than "owner of a URI"? Also, Section 3.3 just got through telling us that if the URI contains a fragment identifier, then the Internet Media Type of the retrieved representation specifies the authoritative interpretation of the fragment identifier. I realize that in one case it's the authoritative interpretation *of the fragment* and in the other it's the authoritative interpretation *of representations of the resource*, but the use of "authoritative interpretation" in both places (particularly when they're so close together) seems potentially confusing.
[[User agents should detect such inconsistencies but should not resolve them without involving the user.]]
Now the term is "user agent" rather than "agents". Is there some particular reason for distinguishing between these terms?
[[Nadia's retrieval of weather information (an example of a read-only query or lookup) qualifies as a "safe" interaction; a safe interaction is one where the agent does not incur any obligation beyond the interaction. An agent may incur an obligation through other means (such as by signing a contract). If an agent does not have an obligation before a safe interaction, it does not have that obligation afterwards]]
Here, "agent" is used in a sense where it might well be a person ("signing a contract"). Can software agents "incur obligations" in the sense used here?
[[Other Web interactions resemble orders more than queries.]]
Is this "orders" in the sense of "placing an order", "that's an order, soldier", or both?
[[Principle: Safe retrieval Agents do not incur obligations by retrieving a representation.]]
Shouldn't this be "should not" rather than "do not"? "Do not" suggests that it doesn't happen, rather than that it's incorrect if it does (as described in the next sentence).
Following the story appears: [[The usefulness of a resource depends on good management by its owner. As is the case with many human interactions, confident interactions with a resource depend on stability and predictability. The value of a URI increases with the predictability of interactions using that URI. Avoiding unnecessary URI aliases is one aspect of proper resource management.]]
While the last sentence above certainly seems true, it's not clear what it has to do with the story, since the sentence refers to URI aliases, but the story describes multiple uses of the *same* URI returning wildly different results. Is this supposed to refer to "URI ambiguity" instead? It's also not clear that the problem illustrated in the story is necessarily "inconsistent representation". Given the diagram in Section 1, which distinguishes the URI from the resource it identifies, it seems possible to distinguish (at least conceptually) between:
It might help clarify the point made in this section if some examples of mistaken attempts to restrict the use of URIs were given, rather than just the building security analogy. Also, it's not clear whether or not the principle described here (and the further discussion in the "Deep Linking" finding) deals with all possible situations of this sort. For example, it certainly used to be the case (and may be the case now) that US Defense Department documents could not only have a security classification, but their *titles* might also have a security classification (that is, the *existence* of the document was classified). A classified document with an unclassified title could be referenced in the usual way, but a reader without the necessary clearance would be unable to access the referenced document (this would correspond to the situations already described). On the other hand, classifying the title of the document would prevent the reader from even seeing the reference without the necessary clearance. How would you suggest handling this situation (admittedly, opaqueness of URIs would help!)?
[[There is typically a (long) transition period during which multiple versions of a format, protocol, or agent are simultaneously in use.]]
This is another passage that reads strangely if "agent" is considered to include people (I know I'm not the same person I once was, but multiple versions simultaneously in use?!).
[[Good practice: Version information Format designers SHOULD provide for version information in language instances]]
What are "language instances"?
The TAG agreed to accept this change: "A format specification SHOULD provide for version information."
Adopt reviewer proposal.
GPN now reads: "A data format specification SHOULD provide for version information."
[[As part of defining an extensibility mechanism, a specification should set expectations about agent behavior in the face of unrecognized extensions.]]
The following good practice then says
[[Language designers SHOULD specify agent behavior in the face of unrecognized extensions.]]
It's not clear that a specification "setting expectations about" agent behavior is the same as it "specifying" it. Why the difference in wording?
Delete "As part of defining ..." sentence.
Deleted "As part of defining an extensibility mechanism, specification designers should set expectations about agent behavior in the face of unrecognized extensions."
Third bullet [[ * RDF allows well-defined mixing of vocabularies, and allows text and XML to be used as a data type values within a statement having clearly defined semantics.]]
"...allows text and XML to be used as data type values..." (delete the "a")?
Within the same statement? What does "having clearly defined semantics" modify? Should this be "...within statements having clearly defined semantics"?
[[How do the application designers ensure that there are no naming conflicts when they combine elements from different formats (for example, suppose that the "p" element is defined in two or more XML formats)? "Namespaces in XML" [XMLNS] provides a mechanism for establishing a globally unique name that can be understood in any context.]]
I'd suggest rewriting this to avoid the rhetorical question.
Also, this may or may not be the right place to cite this example, but the way the XML Schema data types define a namespace for those types, allowing them to be used in RDF and OWL, might also be cited. (This example illustrates the need sometimes to explicitly describe how URIs identifying language elements should be constructed using those namespace names).
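The naming conflict described in the quoted text (a "p" element defined in two XML formats) can be illustrated with a small Python sketch. The second namespace URI, the prefixes, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two formats both define a "p" element; namespaces keep them apart.
DOC = """<doc xmlns:h="http://www.w3.org/1999/xhtml"
     xmlns:o="http://example.com/ns/other">
  <h:p>a paragraph</h:p>
  <o:p>something else entirely</o:p>
</doc>"""

root = ET.fromstring(DOC)

# ElementTree exposes each element as "{namespace-URI}local-name".
expanded_names = [child.tag for child in root]
```

Both children have the local name "p", but their expanded names differ because each is qualified by a distinct namespace URI.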
At TP2004, one of the presentations referred to some material in WebArch about extensibility [1]. Karl Dubost started a Wiki topic about it [2], because QA also deals with extensibility in the QA Framework (QAF) [3].
In [1] and [3], TAG and QA might have slightly different concepts behind similar terminology, and/or might be implying slightly different advice. Even though the QAF is being radically revised and trimmed down, it seems clear now that extensibility is one of the topics that will almost certainly survive in a new, leaner Specification Guidelines.
At this point, QAWG would like to suggest some liaison or further detailed discussion during the revision of our respective documents, with a view towards making our respective extensibility-related content consistent. We could participate in a joint teleconference, or QAWG could prepare a detailed look at the respective extensibility bits, or both.
The term 'language' is used both for natural language (in the Abstract) and for (document/data) format. In the latter case, 'format' should be used everywhere, to avoid confusion. This would be the same as in Charmod.
'Oaxaca' is used in many examples. Glad to see a non-US example, but we are afraid that this may lead to questions on how to pronounce it in large parts of the world. We suggest replacing it with something simpler; one idea might be 'Lima' (although the weather is not as good there as in Oaxaca :-( ).
section 1, figure: please show charset in Content-type.
First bullet: This and http://www.w3.org/2001/tag/doc/whenToUseGet.html#i18n are basically okay. However the word 'limitations' (related to i18n) may give the wrong impression; it is not clear what the i18n concerns are. We suggest that you describe the issue more clearly, e.g. as "The design works reasonably well, although there are issues related to the transmission of non-ASCII characters." (Please note the use of the word 'issues' rather than 'limitations'; although there are indeed some limitations as to the combinations of encodings in form pages and in requests, due to well-established practices based on HTML 4, there is no fundamental limitation to the basic use of non-ASCII characters.) Also, please make sure the reader can directly go to the relevant section in the finding. Also, you may want to point to the FAQ on "What is the best way to deal with encoding issues in forms that may use multiple languages and scripts?" http://www.w3.org/International/questions/qa-forms-utf-8.html
Third bullet:
"Some authors use the META/http-equiv approach to declare the character encoding scheme of an HTML document. By design, this is a hint that an HTTP server should emit a corresponding "Content-Type" header field. In practice, the use of the hint in servers is not widely deployed. Furthermore, many user agents use this information to override the "Content-Type" header sent by the server. This works against the principle of authoritative representation metadata."
This is rather misleading on several points:
Error Handling: It might be very good to say something about character encoding/labeling errors.
Just before 2.1:
"Of course, what an agent does with a URI may vary."
It would be better to mention more explicitly that this e.g. can include language negotiation,...
2nd para. The first sentence establishes that character-by-character inequality doesn't mean that the resource referred to is different. But the subsequent sentences say basically the opposite (that this is the most straightforward way to find resource equality). Break into two paragraphs, or otherwise improve the wording to confuse the reader less.
Overtaken by events.
Comment addressed by this draft.
3rd para. The casing example for weather.example.com/Oaxaca is a bit obscure. Perhaps spell out the fact that case sensitivity matters to some systems?
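The casing point can be spelled out with a short Python sketch. It assumes the generic URI rule that the scheme and host are case-insensitive while the path is compared as written; the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(uri: str) -> str:
    """Lowercase only the parts that are case-insensitive: scheme and host."""
    parts = urlsplit(uri)
    # Keep any userinfo as written; lowercase only host (and port).
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


def equivalent_by_case_rules(a: str, b: str) -> bool:
    """True if the URIs match after case normalization of scheme and host."""
    return normalize_case(a) == normalize_case(b)


host_case_only = equivalent_by_case_rules(
    "HTTP://Weather.Example.COM/oaxaca", "http://weather.example.com/oaxaca"
)
path_case_differs = equivalent_by_case_rules(
    "http://weather.example.com/Oaxaca", "http://weather.example.com/oaxaca"
)
```

An agent following these rules treats the first pair as equivalent but not the second, which is why the quoted text says agents "will not detect the equivalence relationship by following specifications" when only the path casing differs.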
"For instance, one might reasonably create URIs that begin with "http://www.example.com/tempo" and "http://www.example.com/tiempo" to provide access to resources by users who speak Italian and Spanish."
It is nice to see an i18n-related example. However, there are all kinds of issues with this. This is not necessarily a good way to organize information in different languages on a server, in particular if the information is highly parallel. It may be better to find another example, for example with two English words. Also, 'tempo' is an English word with a different meaning. Perhaps German "Wetter" is better?
Merge sections 2 and 2.1
Deleted the example since these two URIs identify two different resources. This was part of a series of other changes to these early subsections of section 2. However, see section 3.3.2 for additional information on content negotiation.
4th para.
"Likewise, URI consumers should ensure URI consistency. For instance, when transcribing a URI, agents should not gratuitously escape characters. The term "character" refers to URI characters as defined in section 2 of [URI]".
The definition of 'character' in the first sentence is not clarified by section 2 of the URI draft, which deals with details such as percent escaping of characters. Section 1 of the URI draft *points to* a definition of 'character'. This is an area where the presence of IRI would be welcome. It might be more useful to describe what "gratuitous" means in this context (there is currently no definition; we *think* it means "don't escape characters unless it breaks usability", i.e. I would expect to see %20 instead of a space, because a space breaks the URI semantically).
'The term "character"...': please say instead: 'The term "character" in the foregoing sentence...' (character takes on other meanings later...)
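The difference between necessary and gratuitous escaping, as the reviewers tentatively interpret it, can be sketched in Python. The paths are hypothetical, and "gratuitous" is read here as escaping a character that could have been written literally.

```python
from urllib.parse import quote, unquote

# Necessary: a space cannot appear literally in a URI, so it is escaped.
necessary = quote("/weather/oaxaca city")

# Gratuitous: "a" is an ordinary letter; writing it as %61 serves no purpose.
plain = "/oaxaca"
gratuitous = "/o%61xaca"

# The two strings differ character by character...
differ_as_strings = plain != gratuitous
# ...even though they decode to the same path.
same_after_decoding = unquote(gratuitous) == plain
```

An agent that compares URIs character by character will treat `plain` and `gratuitous` as different, which is one reason not to introduce escapes when transcribing a URI.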
para 2. Shouldn't there be a "but..." at the end of this paragraph? Yes, URI ambiguity is not the same thing as natural language ambiguity... but what is it? Please make the example more direct:
"URI ambiguity should not be confused with ambiguity in natural language. The English statement "'http://www.example.com/moby' identifies 'Moby Dick'" is ambiguous because one could understand the phrase "Moby Dick" to refer to distinct resources: a particular printing of this work,..."
This is highly ambiguous (sic!). Is 'http://www.example.com/moby' identifies 'Moby Dick' a statement in natural language, showing how natural language can be ambiguous? Or is it a statement about a URI, showing how URIs can be ambiguous? Better change to say: "The URI http://www.example.com/moby is used ambiguously if it is used for more than one of the following: a particular printing of this work,..."
URI ambiguity. This may imply or suggest that natural language differences in the representation of a resource are considered bad. There should be examples of both good and bad ambiguity (or in WebArch terminology, different but consistent representations of the same resource as opposed to the use of a single URI for different resources), with language negotiation being a good example and wholly different resources being a bad example.
Missing a word in "URI ambiguity arises >>when<< (or 'as') a URI is used to identify two different Web resources."
Good practice: URI opacity: This says "Agents making use of URIs MUST NOT attempt to infer properties of the referenced resource except as licensed by relevant specifications." Earlier, the document defines 'agent' as both humans and machines.
This good practice is not too difficult to follow for agents (although it seems to disallow, e.g., Google from considering pieces of a URI in its algorithms, such as the 'weather' and 'oaxaca' in 'http://weather.example.com/oaxaca'; we're not sure disallowing this is intended or makes sense).
However, this practice is *impossible* to follow for humans: It's just completely impossible to look at http://weather.example.com/oaxaca and NOT infer that this may be about 'weather' or 'oaxaca'. The WebArch document itself is using this connection all the time. This is important in connection with IRIs.
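To make the concern above concrete, here is a minimal Python sketch of the kind of inference the comment describes: splitting a URI into word-like tokens that an indexing agent might treat as hints. The function name `uri_tokens` is hypothetical, and this is not a description of how any real search engine works.

```python
import re
from urllib.parse import urlsplit


def uri_tokens(uri):
    """Return lowercase word-like tokens from the host and path of a URI.

    Hypothetical helper: it only illustrates inferring hints from the
    spelling of a URI, which the opacity good practice cautions against.
    """
    parts = urlsplit(uri)
    text = "{} {}".format(parts.hostname or "", parts.path)
    # Split on anything that is not a letter or digit; drop empty pieces.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


tokens = uri_tokens("http://weather.example.com/oaxaca")
print(tokens)
```

Nothing in the URI guarantees that the resource is about weather or Oaxaca; the tokens are at best a guess, which is the tension the reviewer points out.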
Overtaken by events.
List item 2: "The XLink 1.0 [XLink10] specification, which defines the href attribute in section 5.4, states that "The value of the href attribute must be a URI reference as defined in [IETF RFC 2396], or must result in a URI reference after the escaping procedure described below is applied.""
This refers to the conversion from an IRI to a URI. It would be a good occasion to mention IRIs.
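As a rough illustration of the escaping step mentioned above, the following Python sketch percent-encodes each non-ASCII character as its UTF-8 bytes. It is a simplification under stated assumptions: the function name `escape_non_ascii` is hypothetical, ASCII characters (including any that a specification might also require escaping) are left untouched, and no special handling of internationalized host names is attempted.

```python
from urllib.parse import quote


def escape_non_ascii(value):
    """Percent-encode non-ASCII characters as UTF-8 bytes (%HH).

    Hypothetical helper covering only the non-ASCII part of an
    IRI-to-URI style conversion; ASCII characters pass through as-is.
    """
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="")
        for ch in value
    )


print(escape_non_ascii("http://www.example.org/résumé"))
```

For the input shown, each "é" becomes `%C3%A9`, so the result contains only ASCII characters.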
Maybe mention that editing tools may be more strict than simple user agents.
We believe that charset handling, the way it is currently specified in various specs (i.e. outer information has priority over inner information), is basically okay (with the exception of the (irrelevant in practice) iso-8859-1 default given in the HTTP spec, and the us-ascii default for text/foo+xml, which makes text/foo+xml rather useless). It might be good to reach some consensus about this, and document it.
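The precedence described above can be sketched in Python as follows. This is a simplified illustration, not a normative algorithm: the function name `effective_charset` and its parameters are hypothetical, byte-order-mark detection is ignored, and the final UTF-8 fallback for non-text XML types is an assumption based on XML's own default.

```python
def effective_charset(media_type, http_charset=None, xml_decl_encoding=None):
    """Pick a character encoding, letting outer information win.

    media_type: e.g. "text/xml" or "application/xml" (no parameters).
    http_charset: charset parameter from the Content-Type header, if any.
    xml_decl_encoding: encoding named in the XML declaration, if any.
    """
    # Outer (protocol-level) information has priority over inner information.
    if http_charset:
        return http_charset.lower()
    if media_type.startswith("text/"):
        # The us-ascii default for text/xml and text/foo+xml noted above.
        if media_type == "text/xml" or media_type.endswith("+xml"):
            return "us-ascii"
        # The iso-8859-1 default for other text types given in the HTTP spec.
        return "iso-8859-1"
    # Non-text types: fall back to the document's own (inner) declaration.
    if xml_decl_encoding:
        return xml_decl_encoding.lower()
    return "utf-8"  # assumption: XML's default when nothing is declared


print(effective_charset("text/xml", xml_decl_encoding="UTF-8"))
print(effective_charset("application/xml", xml_decl_encoding="UTF-8"))
```

The first call shows why the reviewer calls text/foo+xml "rather useless": without an outer charset parameter, the declaration inside the document is ignored in favor of the us-ascii default.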
Overtaken by events.
"Furthermore, server managers can help reduce the risk of error through careful assignment of representation metadata (especially that which applies across representations). The section on media types for XML presents an example of reducing the risk of error by providing no metadata about character encoding when serving XML."
This seems to pick out a somewhat arbitrary detail, without stating the much more important underlying principles, such as:
In response to the call for review of [1], several members of the RDFCore WG have submitted many detailed comments on the document [2][3][4][5] which we hope the TAG will find useful. We are confident the TAG will give due consideration to each comment on its merits with or without specific endorsement by the RDFCore WG. RDFCore requests, however, that the TAG pay particular attention to comments that relate to:
We trust that the citation of the RDF Model and Syntax specification be replaced with a citation of the new RDF Concepts and Abstract Syntax document [6] in due course.
I am not sure it is empirically possible to decide which of the above two types of representation is correct (i.e. I think both are correct under specific circumstances).
That hits the nail on the head. The webarch document would be much more useful if it explained the two or so commonly held interpretations of Web architecture rather than trying to explain away all but one.
At the level of code, the Web just works, as always. But at the level of web developers making decisions about how to allocate URIs and so on, interpretations are the deciding factor.
If I write an HTTP server, and think of it as serving byte ranges of specific files, that's OK.
If I use a URI to make assertions about a remote galaxy, that's OK too.
The webarch document needs to document these interpretations.
I've been following most of the TAG discussions. Often there are several possible interpretations for some aspect under discussion, and often there is no empirical way to decide which one is correct. Both can be correct under specific circumstances. Such is the nature of any "architecture".
An amazing amount of work has gone into the webarch document. Good work. Still it fails to describe what could be called an architecture.
The document in question reads more like "building codes", the sort of things that can be universally agreed upon, and concretely argued. A room full of brilliant architects, however, will just never agree on a single design for anything non-trivial. Each will have a different philosophy on how things should be described, and each a unique approach.
In other words, the W3C consensus process (wildly successful as it is in many areas) is fundamentally the wrong way to write down "the Web architecture".
Some concrete suggestions for changes:
Reviewer observed that:
Reviewer suggested that:
The document lists SOAP beside HTTP, FTP, NNTP and SMTP, but the IETF sees SOAP as a different thing from the other protocols: all the others are transported directly on TCP, while SOAP needs some more "things" between TCP and SOAP. Normally, SOAP is transported on HTTP, SMTP or BEEP according to various specifications. This might be confusing for the reader if it is not clarified.
The section talks about Internet Media Type and Fragment Identifiers. These are two orthogonal things. The selection of media type for a resource is one thing, and potential use of fragment identifier another.
It would be good if the statement about Fragment identifier consistency was independent from the Fragment Identifier issues, because this statement should be general for a URI. Potentially it should be moved to the earlier section which talks about equivalence of URIs.
This is described somewhat in section 3.6 but...
It would be better if the stories only talked about how a web server should behave, and did not show bad behaviour. In this case (the story directly below 3.5.1), it should say that the Content-Location header is used.
The title of the section implies the section will talk about "unsafe transactions", which to some readers might also imply discussion about the use of SSL/TLS. I would recommend a new title, as the section talks more about "stability and ability to bookmark" URIs.
It would be good if the overall differences between binary and textual data formats were listed, the most important one being the linebreak changes on textual data which in turn might make signing of data difficult/different.
The three (!) different link types have different meaning, and it should be spelled out when to use which.
The World Wide Web (WWW, or simply Web) is an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URIs).
That's a pretty broad definition, as it would include resources defined by other URI schemes (such as sip and tel) which I don't think are normally associated with the "web". The intent of the document, I think, is to generally think about resources that correspond to content.
The terms MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are used in the good practice notes, principles, etc. in accordance with RFC 2119 [RFC2119]. However, this document does not include conformance provisions for at least these reasons:
This may be a nit, but RFC2119 is really meant to be applicable to protocol specifications, not practices and principles.
Authors of specifications SHOULD NOT introduce a new URI scheme when an existing scheme provides the desired properties of identifiers and their relation to resources
The inverse (converse?) is also true - you should reuse a scheme and protocol when they do have the desired properties. It might be a good idea to reference RFC3205 in this regard.
Overtaken by events.
There remain open questions regarding Web interactions. The TAG expects future versions of this document to address in more detail the relationship between the architecture described herein, ... voice-over-ip (including RTSP [RFC2326]).
RTSP does not qualify as voice over IP by most people's definitions. It's generally called "streaming media", and if you want to reference a VoIP protocol, try SIP (RFC 3261).
IETF has established an IANA registry for namespaces, so as to guarantee uniqueness. It might be worth referencing the specification that creates this registry, RFC 3688.
Classified as editorial, addressed by the editor. Added the ref.
Last update: $Date: 2004/08/10 13:11:54 $
This page was generated as part of the Extensible Issue Tracking System (ExIT)
Copyright © 2003, 2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved.