-
Notifications
You must be signed in to change notification settings - Fork 366
Support CODDTest for DuckDB #1224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late review. I looked through the files and added some comments and suggestions. Let me know once you've had a look at them. In particular, the DuckdbCODDTestOracle
is difficult to understand, and I think it will be tough to maintain and extend it in the future. However, I also understand that refactoring this now might be challenging. I'm fine with merging the current state, perhaps after addressing some of the easy issues, but it would also be good to think about what we could improve.
@@ -57,6 +57,14 @@ public static List<String> getExpressionErrors() { | |||
errors.add("Cannot subtract infinite timestamps"); | |||
errors.add("Timestamp difference is out of bounds"); | |||
|
|||
// added by CODDTest | |||
// errors.add("must appear in the GROUP BY clause or be used in an aggregate function"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we can remove these commented-out messages?
@@ -77,6 +77,25 @@ public class DuckDBOptions implements DBMSSpecificOptions<DuckDBOracleFactory> { | |||
@Parameter(names = "--max-num-updates", description = "The maximum number of UPDATE statements that are issued for a database", arity = 1) | |||
public int maxNumUpdates = 5; | |||
|
|||
public enum CODDTestModel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the same for all CODDTest implementations, perhaps we could extract this to a common enum with the next PR?
@@ -41,9 +41,19 @@ public static class DuckDBCompositeDataType { | |||
|
|||
private final int size; | |||
|
|||
// This is used to handle the type that out of the scope of our code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't understand the comment and use of this. Could we perhaps explain or give an example?
} | ||
|
||
public void visit(DuckDBTypeCast expr) { | ||
if (expr.getExpression() instanceof DuckDBNullConstant) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we are re-interpreting the AST here (e.g., if it's a NULL
, we ignore the cast). In the other parts, I think we treat the AST as a ground truth (i.e., we just directly print its structure). Would it be possible to construct the AST earlier on correctly, by omitting casts if we have null
values or if the type is unknown
?
innerQuery.setFetchColumns(Arrays.asList(innerQueryFetchColumn)); | ||
} | ||
|
||
// for (Map.Entry<Integer, DuckDBCompositeDataType> entry: idxTypeMap.entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove such commented-out code?
return true; | ||
} else { | ||
System.out.printf("Wrong option of --coddtest-model, should be one of: RANDOM, EXPRESSION, SUBQUERY"); | ||
System.exit(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using a switch-case on the enum would be clearer here, throwing an AssertionError
in the default case.
// } | ||
for (int i = 0; i < v1Value.size(); ++i) { | ||
if (!v1Value.get(i).equals(v2Value.get(i))) { | ||
String regx = "[+-]*\\d+\\.?\\d*[Ee]*[+-]*\\d+"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this regex do? We should document this.
String regx = "[+-]*\\d+\\.?\\d*[Ee]*[+-]*\\d+"; | ||
Pattern pattern = Pattern.compile(regx); | ||
if (pattern.matcher(v1Value.get(i)).matches() && pattern.matcher(v2Value.get(i)).matches()) { | ||
if (!v1Value.get(i).substring(0, 6).equals(v2Value.get(i).substring(0, 6))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the 6 come from? It would be good to document this.
return joinExpressions; | ||
} | ||
|
||
private List<DuckDBExpression> genOrderBys(DuckDBExpression specificCondition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have similar logic in the expression generator of DuckDB? If so, we can reuse it?
|
||
@Override | ||
public void check() throws Exception { | ||
reproducer = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the logic in this class and method is difficult to understand. It would be ideal to refactor it, for example, by extracting methods and reducing code duplication between CODDTest implementations. However, I also understand that this might be risky, as we would no longer be certain that the implementation still works after refactoring.
Support CODDTest for DuckDB. This is a sub-task of the larger implementation tracked in #1054.