Request for project inclusion: Helstrom Quantum Centroid Classifier #44

leockl · 2020-01-24T11:11:55Z

Request for project inclusion in scikit-learn-contrib

Project name: helstrom-quantum-centroid-classifier
Project description: A package which implements a quantum-inspired supervised classification approach for data with binary classes
Authors: Leo Chow, Giuseppe Sergioli, Roberto Giuntini
Current repository: https://github.com/leockl/helstrom-quantum-centroid-classifier
Requirements:
scikit-learn compatible (check_estimator passed)
Documentation (guide, API reference, example gallery)
Unit tests (coverage: 90%)
Python3 compatible
PEP8 compliant
Continuous integration

adrinjalali · 2020-01-28T14:04:35Z

Looks like a decent one to me :)

leockl · 2020-01-29T00:53:56Z

Thank you @adrinjalali for reviewing this and for the thumbs up!

chkoar · 2020-03-05T20:35:12Z

Great package. I would use all package names, file names and variable names in snake_lower case.

chkoar · 2020-03-05T20:39:03Z

Couldn't we use a centroids_ array instead of both centroid_class_0_ and centroid_class_1 to stay in line with NearestCentroid?

leockl · 2020-03-09T06:35:39Z

Thanks for the upvote @chkoar. We will get back to you soon on the two points you have raised, as we are also in the process of updating the package with some new features.

chkoar · 2020-05-30T19:35:22Z

Any update on this?

leockl · 2020-06-01T12:48:24Z

Hey guys, sorry I have been caught up with a few too many things the last couple of weeks. I will get this completed by the end of this week.

My sincerest apologies for the delay.

leockl · 2020-06-07T12:55:26Z

Hi @chkoar, the package has now been updated:

package names, file names and variable names in snake_lower case - with the exception of the variable feature matrix X and variable names using X, to be in line with Scikit-learn naming convention and readability purposes.
centroids_ array is now used instead of both centroid_class_0_ and centroid_class_1 to stay in line with NearestCentroid.

Thank you and looking forward to your response.

chkoar · 2020-06-07T15:12:08Z

Hey @leockl ,

I would name it centroids_, as you said, not centroid_ as it is named now.
It seems that the classifier can cope only with binary problems, right? Could you detect that and warn the user? It seems that estimator tags are the way to go.
In the scikit-learn ecosystem it is not common to have constructors in lower case, no? So, I would rename the class in this line to HQC.

leockl · 2020-06-08T01:28:17Z

Hi @chkoar, yes, for now the classifier can only cope with binary problems. I have already included this warning in the first paragraph of the README.md file and in the first paragraph of the documentation. The estimator tag is also already included in the source code here. Are these enough? Otherwise, I can add a few lines of code to produce a warning when there are more than 2 classes.

I will update centroid_ to centroids_ and change the class constructor back to HQC.

Many thanks.

leockl · 2020-07-13T05:54:52Z

Hi @chkoar, could you please provide your updated feedback on the estimator tags that you mentioned above?

leockl · 2020-07-26T10:16:06Z

Hi @chkoar, I have updated the package with your previous two suggestions:

updated centroid_ to centroids_.
changed the class constructor to capitals HQC.

I haven't heard back from you in a while, including on whether the estimator tag I mentioned above is good enough. Could you please have a look and complete this? This package has been sitting here for quite a while now.

Many thanks in advance!

chkoar · 2020-07-27T08:08:40Z

Hey @leockl,

> The estimator tag is also already included in the source code here.

I had the impression that check_estimator was testing the case where you have a binary classifier and you are passing a non-binary problem. But as far as I can see, this is not the case.

> Hi @chkoar, yes, for now the classifier can only cope with binary problems.

IINM, as a user I would expect as many probabilities (from predict_proba) as there are classes in the problem.
So, I suppose that you should raise an exception. Something like the following:

if len(self.classes_) > 2:
    raise ValueError('Only 2 classes are supported')
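To show where such a check could sit, here is a minimal runnable sketch. It assumes a toy class, `ToyBinaryClassifier`, which is hypothetical and is not the real HQC; only the two-class guard is taken from the snippet above.

```python
class ToyBinaryClassifier:
    """Hypothetical stand-in for a binary-only estimator; not the real HQC."""

    def fit(self, X, y):
        # Record the distinct labels, mirroring scikit-learn's `classes_`.
        self.classes_ = sorted(set(y))
        # Refuse multiclass targets instead of silently mishandling them.
        if len(self.classes_) > 2:
            raise ValueError("Only 2 classes are supported")
        return self


# A binary target is accepted.
clf = ToyBinaryClassifier().fit([[0.0], [1.0]], [0, 1])

# A three-class target is rejected with the message from the guard.
try:
    ToyBinaryClassifier().fit([[0.0], [1.0], [2.0]], [0, 1, 2])
except ValueError as exc:
    message = str(exc)
```

Raising in fit, rather than only warning in the README, means the failure shows up at the point where the unsupported data is first seen.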

Apart from this, I would add test cases for the HQC learnt attributes. For instance, check the centroids' values on a known problem, or something like that. Also, it seems to me that requirements.txt contains redundant dependencies, no? Also, setup.py is missing install_requires.
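As a rough idea of what a test for a learnt attribute could look like, the sketch below fits a toy model on a tiny dataset whose class means can be computed by hand. `ToyCentroidClassifier` and the test name are hypothetical; the real HQC computes its attributes differently, so only the testing pattern carries over.

```python
class ToyCentroidClassifier:
    """Hypothetical nearest-centroid-style model; not the real HQC."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = []
        for label in self.classes_:
            rows = [x for x, t in zip(X, y) if t == label]
            # Per-feature mean of the rows belonging to this class.
            self.centroids_.append([sum(col) / len(rows) for col in zip(*rows)])
        return self


def test_centroids_on_known_problem():
    # Class 0 has mean (1, 1); class 1 has mean (11, 2).
    X = [[0.0, 0.0], [2.0, 2.0], [10.0, 0.0], [12.0, 4.0]]
    y = [0, 0, 1, 1]
    clf = ToyCentroidClassifier().fit(X, y)
    assert clf.classes_ == [0, 1]
    assert clf.centroids_ == [[1.0, 1.0], [11.0, 2.0]]


test_centroids_on_known_problem()
```

With floating-point attributes that are not exactly representable, a tolerance-based comparison would be the safer choice than `==`.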

ping @rth @glemaitre @adrinjalali

leockl · 2020-08-04T07:02:19Z

Hi @chkoar,

Many thanks for your response. I will make all the suggested changes. Regarding the request to add test cases for the HQC learnt attributes, do you have any examples or guidelines on how this is done?

chkoar · 2020-08-07T13:31:17Z

@leockl could you take a look here?

leockl · 2020-09-02T04:00:11Z

Hi @chkoar,

My sincere apologies for the delay in my reply. I have now updated the package with your suggestions:

added the exception error when there are more than 2 classes.
added test cases for the HQC learnt attribute hels_bound_.
removed redundant dependencies in requirements.txt.
included install_requires in setup.py.

Please review. Many thanks in advance.

leockl · 2020-11-01T01:17:31Z

Hi @chkoar, could I please get an update on this? Many thanks.

chkoar · 2020-11-16T07:18:11Z

@leockl sorry for the delayed reply.

Below, I list some (probably minor) points, since I cannot open issues in your fork.

It seems that this block and this are identical. You could refactor in order to reduce the LOC.
'ij,ji->' appears three times. It could be a constant.
In this and this line we have a self-assignment. Is there a reason?
Is u being reused in this line and this line? If not I would change it to _.
Is n being reused in this line? If not I would change it to _.

Apart from that, as @adrinjalali said, looks like a decent one.

@rth @glemaitre could this estimator be part of the scikit-learn-extra repo? In any case it would be nice if someone else could review this package.

leockl · 2020-11-20T06:19:09Z

Hi @chkoar,

Need to clarify a few points with you:

2nd point: What do you mean by it could be a constant? Could you show an example?
3rd point: This is correct and it's part of the algorithm when n_copies == 1.
4th point: What do you mean by change it to _? Could you show an example?
5th point: I guess this would be the same as the 4th point above.

Could I confirm whether you require any more changes, @chkoar?

@rth @glemaitre Please let us know if you have any feedback. If not, I hope these will be the final changes required.

Please let me know. Many thanks in advance!

chkoar · 2020-11-20T15:23:56Z

> 2nd point: What do you mean by it could be a constant? Could you show an example

subscripts = "ij,ji->"
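For context, a small sketch of how such a constant might be used with NumPy. The name `TRACE_OF_PRODUCT` and the two matrices are assumptions made for illustration and do not come from the HQC source.

```python
import numpy as np

# Named once so that repeated call sites cannot drift apart.
# "ij,ji->" sums a[i, j] * b[j, i] over i and j, i.e. trace(a @ b).
TRACE_OF_PRODUCT = "ij,ji->"

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

via_einsum = np.einsum(TRACE_OF_PRODUCT, a, b)
via_trace = np.trace(a @ b)  # same value, computed the long way
```

Besides removing the repetition, a descriptive name documents what the subscript string computes.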

> 4th point: What do you mean by change it to _. Could you show an example.

Among other uses, the underscore is a conventional name for a "throwaway" variable: one that won't be used afterwards. Common use cases are iteration and unpacking.

alist = ["one", "two", "three"]

for index, _ in enumerate(alist):
    print(index)
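The unpacking case mentioned above can be sketched the same way. The values here are made up for illustration only.

```python
# divmod returns (quotient, remainder); only the quotient is needed here.
quotient, _ = divmod(17, 5)

# Star-unpacking can discard several trailing values at once.
first, *_ = ["one", "two", "three"]
```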

leockl · 2020-12-09T09:01:00Z

Hi @chkoar,

Many thanks for the above. I have now updated the code according to your change requests:

> It seems that this block and this are identical. You could refactor in order to reduce the LOC.

Done.

> 'ij,ji->' appears three times. It could be a constant.

Done.

> In this and this line we have a self-assignment. Is there a reason?

This is correct and it's part of the algorithm when n_copies == 1. Please refer to the research paper.

> Is u being reused in this line and this line? If not I would change it to _.

Done.

> Is n being reused in this line? If not I would change it to _.

This was an error on my part. n is not reused in predict_proba(), therefore I have now removed n here.

I have also updated the code from using check_X_y() and check_array() to self._validate_data() as required by scikit-learn v0.25.

Please find the updated code here:
https://github.com/leockl/HQC/blob/master/Christos/HQC%20-%20CPU%20-%20Christos.py

I haven't uploaded the updated code to PyPI to release a new version yet. It would be great if you could approve all final changes first, as it takes time to release a new version every time new changes are requested. I hope you understand. Many thanks once again, Christos.

leockl changed the title from "Request for Project Inclusion: Helstrom Quantum Centroid Classifier" to "Request for project inclusion: Helstrom Quantum Centroid Classifier" on Jan 24, 2020

chkoar mentioned this issue Jul 29, 2020

check_estimator and binary_only tag scikit-learn/scikit-learn#18005

Closed
