Skip to content

Improve support for cursors for SQL Server #1831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 2, 2025

Conversation

aharpervc
Copy link
Contributor

This PR is a followup to #1821 to address several difficulties I had parsing real world SQL files.

There are 3 related enhancements here:

  1. Introduce support for parsing OPEN statements, eg: OPEN my_cursor
  2. Expand existing FETCH statement parsing support the FROM keyword. Eg, FETCH NEXT FROM my_cursor. The logic formerly only supported "FETCH NEXT IN" syntax
  3. Introduce support for parsing WHILE statements, which is commonly used in conjunction with cursors. Eg WHILE @@fetch_status = 0.... This is a conditional statement block, much like IF & CASE, so that code has been structured similarly and placed adjacent to those statements.

(4th thing -- a test helper introduced in this PR was brought over here, because it simplifies validating a test case. If the other PR merges first that commit can be dropped out of this branch).

The effect of these changes is that this cursor example from the SQL Server FETCH documentation page now successfully parses:

DECLARE Employee_Cursor CURSOR FOR
SELECT LastName, FirstName
FROM AdventureWorks2022.HumanResources.vEmployee
WHERE LastName LIKE 'B%';

OPEN Employee_Cursor;
FETCH NEXT FROM Employee_Cursor;

WHILE @@FETCH_STATUS = 0
BEGIN
    FETCH NEXT FROM Employee_Cursor;
END;

CLOSE Employee_Cursor;
DEALLOCATE Employee_Cursor

- parse `OPEN cursor_name` statements
- enable `FETCH` statements to parse `FROM cursor_name`, in addition to the existing `IN` parsing
- it's a conditional block alongside IF & CASE
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub struct WhileStatement {
pub while_block: ConditionalStatementBlock,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't absolutely need a WhileStatement struct; we could be doing Statement::While(ConditionalStatementBlock) instead. I'm following the example of CASE & IF, which also do it this way.

src/ast/mod.rs Outdated
/// OPEN cursor_name
/// ```
/// Opens a cursor.
Open {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I placed this next to CLOSE because they're semantically paired, rather than alphabetical. Not sure what's preferred on this project

src/ast/mod.rs Outdated
/// Differentiate between dialects that fetch `FROM` vs fetch `IN`
///
/// [MsSql](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/fetch-transact-sql)
from_or_in: AttachedToken,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what's best here, it could also be two separate Optional fields

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we represent it with an explicit enum?
e.g

enum FetchPosition {
    From
    In
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

@aharpervc aharpervc marked this pull request as ready for review April 28, 2025 17:49
- this is useful since opening a cursor typically happens immediately after declaring the cursor's query
src/ast/mod.rs Outdated
/// OPEN cursor_name
/// ```
/// Opens a cursor.
Open {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we wrap this new statement in a named struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍. I didn't do that originally so as to more closely mimic the existing code

src/ast/mod.rs Outdated
/// Differentiate between dialects that fetch `FROM` vs fetch `IN`
///
/// [MsSql](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/fetch-transact-sql)
from_or_in: AttachedToken,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we represent it with an explicit enum?
e.g

enum FetchPosition {
    From
    In
}


#[test]
fn test_mssql_while_statement() {
let while_single_statement = "WHILE 1 = 0 PRINT 'Hello World';";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include a test case with multiple statements in the while block?

Also since this introduces a new statement, can we include a test case that asserts the returned AST?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include a test case with multiple statements in the while block?

This is covered by the additional subsequent examples

Also since this introduces a new statement, can we include a test case that asserts the returned AST?

Yes, I'll do that here for the initial example case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

canonical: &str,
) -> Vec<Statement> {
let statements = self.parse_sql_statements(sql).expect(sql);
assert_eq!(statements.len(), statement_count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assertion seems to already be covered by the if/else below? so that we can skip the statement_count argument requirement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't fully understand. Without this line you can't guarantee that the string you feed in has exactly the number of statements you intend it to parse. Also, one_statement_parses_to has this same assertion before the if/else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so I meant that in both cases when asserting :

in this case we're already explicitly check that both statements lists are identical

assert_eq!(self.parse_sql_statements(canonical).unwrap(), statements);

Then in this case, we're doing so implicitly, reconstructing the input sql based off the returned statement

assert_eq!(
                sql,
                statements
                    .iter()
                    .map(|s| s.to_string())
                    .collect::<Vec<_>>()
                    .join("; ")
            );

So that i imagine it shouldn't be possible for the count assertion to fail and either of the subsequent assertion to pass?

one_statement_parse_to uses the count internally and there it makes sense to sanity check that since the expected count is always one, in this case we're exposing the count as a function argument which makes for suboptimal API that the user has to manually increment/decrement a counter when the sql changes. So that if the count argument isn't strictly necessary we would want to skip it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then in this case, we're doing so implicitly, reconstructing the input sql based off the returned statement ... So that i imagine it shouldn't be possible for the count assertion to fail and either of the subsequent assertion to pass?

I will remove the assertion here to get this branch merged. However, I think removing it removes a level of safety that is beneficial. Part of my thinking here is motivated by my upcoming branch on making semi colon statement delimiters optional. So any code that is making assumptions about "number of statements" becomes even more useful.

But perhaps that branch can re-introduce that assertion if necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, this helper was introduced over on the GO branch, does your opinion change at all seeing the usage over there?

@@ -8735,6 +8779,14 @@ impl<'a> Parser<'a> {
})
}

/// Parse [Statement::Open]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I missed it, we seem to be lacking test cases for the open statement feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's part of test_mssql_cursor, but I'll make a separate test function just for OPEN for clarity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

Comment on lines 4490 to 4492
if let Token::EOF = self.peek_nth_token_ref(0).token {
break;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we collapse this into above to use a match statement?

match &self.peek_nth_token_ref(0).token {
    Token::Word(n) if ...
    Token::Eof 
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍. I was probably trying to minimize the diff for review here

@aharpervc aharpervc force-pushed the mssql-cursor-usage branch from e0b484e to bf0036a Compare April 29, 2025 21:35
Comment on lines -4230 to +4273
write!(f, "FETCH {direction} ")?;

write!(f, "IN {name}")?;
write!(f, "FETCH {direction} {position} {name}")?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be write!(f, "{position} {name}")?;, not sure what the pro/con is on that

@aharpervc aharpervc requested a review from iffyio April 29, 2025 22:03
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @aharpervc!
cc @alamb

Comment on lines +749 to +751
let begin_token = self.expect_keyword(Keyword::BEGIN)?;
let statements = self.parse_statement_list(terminal_keywords)?;
let end_token = self.expect_keyword(Keyword::END)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to have this pattern upcoming in a few places, like #1810 maybe it would be good to pull it out into a method and reuse it both here and the preexisting usage here? We can probably do so in the former PR instead

@iffyio iffyio merged commit a464f8e into apache:main May 2, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants