Skip to content

Initialize XGBoost codegen #765

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 4, 2019

Conversation

Yancey0623
Copy link
Collaborator

@Yancey0623 Yancey0623 commented Sep 3, 2019

Initliaze XGBoost codegen, this PR reuse db_generator to generate XGBoost DMatrix object, part work of #749 .

sperlingxx
sperlingxx previously approved these changes Sep 3, 2019
Copy link
Contributor

@sperlingxx sperlingxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -795,7 +795,7 @@ func xgCreatePredictionTable(pr *extendedSelect, r *antXGBoostFiller, db *DB) er
return nil
}

func genXG(w io.Writer, pr *extendedSelect, ds *trainAndValDataset, fts fieldTypes, db *DB) error {
func genAntXGboost(w io.Writer, pr *extendedSelect, ds *trainAndValDataset, fts fieldTypes, db *DB) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about renaming it to genAntXGBoost?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, that's a typo, thx.

ValidationDatasetSQL string
TrainCfg *xgbTrainConfig
Features []*featureMeta
Label *featureMeta
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found our filler for tensorflow in codegen.go,

sqlflow/sql/codegen.go

Lines 67 to 69 in 9a0dc86

X []*featureMeta
FeatureColumnsCode map[string][]string
Y *featureMeta

Features named to X and Label to Y.
I would suggest such consistent naming.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a better way is reuse the filler in codegen, would optimize it in the next PR.

ParamsAttr map[string]*attribute
}

func resolveXGBTrainClause(tc *trainClause) (*resolvedXGBTrainClause, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try put attribute extraction into the codegen for each submitter?

Copy link
Collaborator Author

@Yancey0623 Yancey0623 Sep 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to resolve XGBoost parameters, the current expression_resover can only resolve the Tensorlfow parameters, maybe we can refactor the expression_resolver to extract getBoolAttr, getIntAttr as the common function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we only have these two parameters currently, try all put it in expression_resolver.go and refine this later. We should let each codegen to deal with it's only attributes.

@weiguoz weiguoz self-requested a review September 3, 2019 11:55
weiguoz
weiguoz previously approved these changes Sep 3, 2019
Copy link
Collaborator

@weiguoz weiguoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ParamsAttr map[string]*attribute
}

func resolveXGBTrainClause(tc *trainClause) (*resolvedXGBTrainClause, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we only have these two parameters currently, try all put it in expression_resolver.go and refine this later. We should let each codegen to deal with it's only attributes.

@Yancey0623
Copy link
Collaborator Author

If we only have these two parameters currently, try all put it in expression_resolver.go and refine this later. We should let each codegen to deal with it's only attributes.

@typhoonzero , there is not only two parameters, and I removed the xgb resolver from this PR, would refactor the expression_resolver.go in the next PR.

@Yancey0623 Yancey0623 merged commit 105e70c into sql-machine-learning:develop Sep 4, 2019
@Yancey0623 Yancey0623 deleted the xgb_codegen branch September 4, 2019 07:15
@Yancey0623 Yancey0623 mentioned this pull request Sep 5, 2019
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants