
[MRG + 1] DOC add sklearn-crfsuite to related projects #7878


Merged: 1 commit, Dec 28, 2016

Conversation

kmike
Contributor

@kmike kmike commented Nov 15, 2016

What do you think about adding sklearn-crfsuite to related projects?

It is a wrapper for CRFsuite (https://github.com/chokkan/crfsuite) that provides a CRF estimator with an API similar to scikit-learn's (compatible with, e.g., the sklearn model selection utilities). There are also some sklearn-compatible metrics and scorers for sequence classification tasks (though not many).

Another similar package is https://github.com/jakevdp/pyCRFsuite, but it is unmaintained.

@jnothman
Member

LGTM. I'd really like to see related projects, vis a vis projects with a compatible/similar API, becoming a more freely managed catalogue to emphasise that scikit-learn is (in one of its manifestations) an ecosystem. Any ideas how best to achieve that?

@kmike
Contributor Author

kmike commented Nov 15, 2016

Options:

  • GitHub wiki page. Pros: anyone can edit it, so there can be more contributions. Cons: less moderation, links can become broken. Discoverability may be solved by adding a prominent link from the scikit-learn docs.
  • Something like an "awesome-sklearn" repository, similar to zillions of other "awesome-..." repositories. Pros/cons: ? I'm not a user of such repositories, but maybe it can be easier for people to subscribe to updates by watching the repository. With a wiki or doc page you need to check it from time to time in order to find new related packages. Cons: the 'awesome-...' repository name is lame.
  • Improve the existing "related projects" page: add a call for more projects right on the http://scikit-learn.org/stable/related_projects.html page, with instructions (please send a PR if you know a project with an API similar/compatible to sklearn and you want it to be listed...). Add a more prominent link to this page. Pros: moderation. Cons: ?

@raghavrv raghavrv changed the title DOC add sklearn-crfsuite to related projects [MRG + 1] DOC add sklearn-crfsuite to related projects Nov 15, 2016
@raghavrv
Member

raghavrv commented Nov 15, 2016

something like "awesome-sklearn" repository

Maybe in scikit-learn-contrib we could add a repository "related-projects" with a README filled with such links?

@kmike
Contributor Author

kmike commented Nov 15, 2016

@raghavrv Having it in the scikit-learn-contrib organization makes sense: not all packages can be moved to the scikit-learn-contrib organization even if they satisfy all conditions, e.g. a company may want to keep a repo under its name, or someone may disagree with giving commit and admin access to a third party even if it is a highly trusted third party.

@raghavrv
Member

Indeed. I'm suggesting it just for the links (instead of a new org/repository 'related-projects' or 'other-projects').

I'm suggesting this because scikit-learn-contrib is now a sizeable collection of "related projects"... Adding a repository there with a README linking all the other related projects that may not be moved there would complete it as a collection of all projects / API wrappers related to sklearn...

@jnothman
Member

Another thing you've subtly pointed out to me is that we don't want this to be constrained to /stable/, so at a minimum we need a separate CI process to publish changes to it (or use the GitHub wiki).

@jnothman
Member

Is it too unfettered to let the wiki store a YAML listing that could then have an update hook that copies it to the web server (with checks and balances to ensure it's a reasonable size and format) for display/search with a DataTable or similar?
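
As a rough illustration of what such "checks and balances" might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names (`name`, `url`, `description`), the size limits, and the function name `validate_listing` are hypothetical and were not proposed in this thread. It also assumes the wiki listing has already been parsed (by whatever YAML parser the hook uses) into a list of mappings.

```python
from urllib.parse import urlparse

# All limits and field names below are hypothetical, chosen for illustration.
MAX_ENTRIES = 500
MAX_FIELD_LENGTH = 300
REQUIRED_FIELDS = ("name", "url", "description")


def validate_listing(entries):
    """Return a list of error strings; an empty list means the listing passes."""
    if not isinstance(entries, list):
        return ["listing must be a list of entries"]

    errors = []
    # Size check: refuse listings that have grown unreasonably large.
    if len(entries) > MAX_ENTRIES:
        errors.append(f"too many entries: {len(entries)} > {MAX_ENTRIES}")

    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            errors.append(f"entry {i}: expected a mapping")
            continue

        # Format check: every required field is a non-empty, bounded string.
        for field in REQUIRED_FIELDS:
            value = entry.get(field)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"entry {i}: missing or empty '{field}'")
            elif len(value) > MAX_FIELD_LENGTH:
                errors.append(f"entry {i}: '{field}' is too long")

        # Link check: only plain http(s) URLs with a host are accepted.
        url = entry.get("url")
        if isinstance(url, str) and url.strip():
            parsed = urlparse(url)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                errors.append(f"entry {i}: 'url' must be an http(s) URL")

    return errors
```

An update hook along these lines would publish the listing only when `validate_listing` returns an empty list, and otherwise keep the previously published copy.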

@mblondel
Member

Github wiki page

We do have this page.

to emphasise that scikit-learn is (in one of its manifestations) an ecosystem

Yes, and so scikit-learn should focus more on solving the hard problems (model selection, sample weights, etc.).

@amueller
Member

If we put this into scikit-learn-contrib we don't need an additional CI process.
And I like the idea of putting it there. I'm not sure about an additional sub-repo vs. just putting it into https://github.com/scikit-learn-contrib/scikit-learn-contrib (and maybe creating a website for that).

@amueller
Member

👍 to merge this now, add the "related projects" to "scikit-learn-contrib" and then just link to there from here.

@raghavrv
Member

I'm not sure about additional sub-repo

The advantage of a dedicated repo either under scikit-learn or scikit-learn-contrib is that it reaches a broader audience by forks and stars. Making a PR to add their project is also much easier. You can see most of the awesome-* repos are quite useful in presenting all related stuff together in a nice page like this one for all machine learning related stuff and this one for tensorflow. You won't even be needing a dedicated webpage per se...

I am actually wondering if we should simply name it awesome-sklearn. I can understand that it's quite lame, but it seems to be quite popular with that name. And as in the above links, we could also link all the (sci/*)py(data/con) talks/tutorials, books and nice projects (not just libraries) related to scikit-learn or the scikit-learn API.

@amueller
Member

The advantage of a dedicated repo either under scikit-learn or scikit-learn-contrib is that it reaches a broader audience by forks and stars.

I don't understand. Why is that easier for a related-projects repo than for the scikit-learn-contrib repo?

@raghavrv
Member

The contrib is for projects that are hosted in that repo. The related-projects (?) repo would be about anything and everything related to sklearn (books / popular blogs / Kaggle pages (maybe?) and of course other projects too...). These things get crowdsourced pretty soon...

I was expecting scikit-learn-contrib itself to be one such link inside that repo...

@amueller
Member

Yeah, but why does that need a separate repo instead of a section in the README? That seems like overkill.

@mblondel
Member

Yeah but why does that need a separate repo instead of a section in the readme

I'd rather not clutter the README file with non-contrib projects. And if we start referencing non-contrib projects, what's the benefit of becoming a contrib project?

I don't feel that we receive so many related-project PRs that we need to rethink our workflow.

@raghavrv
Member

👍 to merge this now, add the "related projects" to "scikit-learn-contrib" and then just link to there from here.

I'm merging this for now. We can have a separate discussion on this if needed...

@raghavrv raghavrv merged commit d97d13e into scikit-learn:master Dec 28, 2016
@raghavrv
Member

Thanks @kmike

sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
@Przemo10 Przemo10 mentioned this pull request Mar 17, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017