The Wayback Machine - https://web.archive.org/web/20201128041506/https://github.com/dotnet/machinelearning/issues/3985
Add FieldAwareFactorizationMachine to AutoML #3985

Open
justinormont opened this issue Jul 10, 2019 · 5 comments

@justinormont justinormont commented Jul 10, 2019

FieldAwareFactorizationMachine works well on large datasets such as the Criteo 1TB dataset.

Currently FieldAwareFactorizationMachine is not swept over in AutoML.

Task:

  • Add trainer to default list of binary learners to try
  • Add sweep range
  • Add to CLI's C# CodeGen

It should be easy to replicate an existing trainer extension, such as the one for SDCA:

    internal class SdcaLogisticRegressionBinaryExtension : ITrainerExtension
    {
        public IEnumerable<SweepableParam> GetHyperparamSweepRanges()
        {
            return SweepableParams.BuildSdcaParams();
        }

        public ITrainerEstimator CreateInstance(MLContext mlContext, IEnumerable<SweepableParam> sweepParams,
            ColumnInformation columnInfo)
        {
            var options = TrainerExtensionUtil.CreateOptions<SdcaLogisticRegressionBinaryTrainer.Options>(sweepParams, columnInfo.LabelColumnName);
            return mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(options);
        }

        public PipelineNode CreatePipelineNode(IEnumerable<SweepableParam> sweepParams, ColumnInformation columnInfo)
        {
            return TrainerExtensionUtil.BuildPipelineNode(TrainerExtensionCatalog.GetTrainerName(this), sweepParams,
                columnInfo.LabelColumnName);
        }
    }
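
For reference, here is a minimal sketch of the public trainer API that a FieldAwareFactorizationMachine extension would end up wrapping, assuming the Microsoft.ML NuGet package is referenced. The class name FfmTrainerSketch, the column names, and the option values are assumptions for illustration only; the values are not a proposed sweep range.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Trainers;

// Hypothetical standalone program; not part of the AutoML code base.
public static class FfmTrainerSketch
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Options an extension would populate from the swept parameters.
        // The values below are illustrative placeholders.
        var options = new FieldAwareFactorizationMachineTrainer.Options
        {
            LabelColumnName = "Label",       // assumed column name
            FeatureColumnName = "Features",  // assumed column name
            LatentDimension = 20,
            NumberOfIterations = 5,
            LearningRate = 0.1f,
            LambdaLinear = 0.0001f,
            LambdaLatent = 0.0001f,
        };

        // Same catalog the SDCA extension above uses, different trainer.
        var trainer = mlContext.BinaryClassification.Trainers
            .FieldAwareFactorizationMachine(options);

        Console.WriteLine(trainer.GetType().Name);
    }
}
```

The numeric options set here (LatentDimension, NumberOfIterations, LearningRate, LambdaLinear, LambdaLatent) are the likely candidates for a sweep range.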

@zHaytam zHaytam commented Aug 25, 2019

Hi, I would like to take this issue if possible.

@justinormont justinormont commented Aug 25, 2019

Sounds great.

This will be a change to the master branch as the AutoML API code has been merged into master. The CLI's CodeGen (which creates the C# project) still lives in the AutoML feature branch.

Perhaps @XiaroanZhang can create the CodeGen part, or walk you through what's needed. @daholste may also be able to offer advice. I'll be out for the next week, but feel free to submit a PR and have folks approve and merge it.

Thanks so much.

@gokart23 gokart23 commented Oct 6, 2019

Hi @justinormont, looks like this issue is still open - do you mind if I work on it?

@mstfbl mstfbl commented Jan 9, 2020

Hi @gokart23 and @zHaytam, if you are still interested, go for it!

@mstfbl mstfbl added the P2 label Jan 9, 2020

@Sandy4321 Sandy4321 commented Jan 26, 2020

5 participants