-
Notifications
You must be signed in to change notification settings - Fork 704
Initialize XGBoost codegen #765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initialize XGBoost codegen #765
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sql/codegen_ant_xgboost.go
Outdated
@@ -795,7 +795,7 @@ func xgCreatePredictionTable(pr *extendedSelect, r *antXGBoostFiller, db *DB) er | |||
return nil | |||
} | |||
|
|||
func genXG(w io.Writer, pr *extendedSelect, ds *trainAndValDataset, fts fieldTypes, db *DB) error { | |||
func genAntXGboost(w io.Writer, pr *extendedSelect, ds *trainAndValDataset, fts fieldTypes, db *DB) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about renaming it to genAntXGBoost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that's a typo, thx.
ValidationDatasetSQL string | ||
TrainCfg *xgbTrainConfig | ||
Features []*featureMeta | ||
Label *featureMeta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found our filler for tensorflow in codegen.go
,
Lines 67 to 69 in 9a0dc86
X []*featureMeta | |
FeatureColumnsCode map[string][]string | |
Y *featureMeta |
Features
named to X
and Label
to Y
.I would suggest such consistent naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a better way is reuse the filler in codegen
, would optimize it in the next PR.
sql/expression_resolver_xgb.go
Outdated
ParamsAttr map[string]*attribute | ||
} | ||
|
||
func resolveXGBTrainClause(tc *trainClause) (*resolvedXGBTrainClause, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try put attribute extraction into the codegen for each submitter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used to resolve XGBoost parameters, the current expression_resover
can only resolve the Tensorlfow parameters, maybe we can refactor the expression_resolver
to extract getBoolAttr
, getIntAttr
as the common function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only have these two parameters currently, try all put it in expression_resolver.go
and refine this later. We should let each codegen to deal with it's only attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sql/expression_resolver_xgb.go
Outdated
ParamsAttr map[string]*attribute | ||
} | ||
|
||
func resolveXGBTrainClause(tc *trainClause) (*resolvedXGBTrainClause, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only have these two parameters currently, try all put it in expression_resolver.go
and refine this later. We should let each codegen to deal with it's only attributes.
@typhoonzero , there is not only two parameters, and I removed the xgb resolver from this PR, would refactor the |
Initliaze XGBoost codegen, this PR reuse
db_generator
to generate XGBoost DMatrix object, part work of #749 .