ENH: Release crackfortran as a standalone tool for parsing Fortran. #25307

Open · wants to merge 1 commit into base: main

dmikushin

@dmikushin dmikushin commented Dec 4, 2023

The crackfortran tool was developed as a backend for the f2py utility. However, crackfortran could also be very useful on its own.

The use of crackfortran can be demonstrated by the following example:

► crackfortran symbol.f90 -show
Reading fortran codes...
        Reading file 'symbol.f90' (format:free)
Post-processing...
        Block: symbol
Post-processing (stage 2)...
[{'args': ['x', 'y', 'symx', 'opt'],
  'block': 'subroutine',
  'body': [],
  'common': {'pltdat': ['model', 'ploter'], 'symbls': ['nsym', 'symbl']},
  'commonvars': ['model', 'ploter', 'nsym', 'symbl'],
  'entry': {},
  'externals': [],
  'from': 'symbol.f90',
  'implicit': None,
  'interfaced': [],
  'name': 'symbol',
  'sortvars': ['model', 'nsym', 'ploter', 'symbl', 'opt', 'x', 'y', 'symx'],
  'vars': {'model': {'attrspec': [], 'typespec': 'integer'},
           'nsym': {'attrspec': [], 'typespec': 'integer'},
           'opt': {'typespec': 'integer'},
           'ploter': {'attrspec': [], 'typespec': 'integer'},
           'symbl': {'attrspec': [],
                     'dimension': ['20', '2'],
                     'typespec': 'integer'},
           'symx': {'attrspec': [], 'dimension': ['2'], 'typespec': 'integer'},
           'x': {'typespec': 'real'},
           'y': {'typespec': 'real'}}}]

The tool presents the subroutine's declaration statements as a JSON-like nested structure (printed above as a Python literal) that other tools can consume for further analysis.
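As a minimal sketch of such downstream consumption: the printed output uses single quotes and `None`, so it is a Python literal rather than strict JSON. The snippet below reads a trimmed copy of the output above with `ast.literal_eval` and re-serializes it with `json.dumps`. The helper name `describe_args` is hypothetical and is not part of crackfortran or f2py.

```python
import ast
import json

# Output copied from the example above, trimmed to a few keys for brevity.
printed = """
[{'args': ['x', 'y', 'symx', 'opt'],
  'block': 'subroutine',
  'implicit': None,
  'name': 'symbol',
  'vars': {'opt': {'typespec': 'integer'},
           'symx': {'attrspec': [], 'dimension': ['2'], 'typespec': 'integer'},
           'x': {'typespec': 'real'},
           'y': {'typespec': 'real'}}}]
"""

# The text is a Python literal (single quotes, None), which json.loads would reject.
blocks = ast.literal_eval(printed.strip())


def describe_args(block):
    """Hypothetical helper: (name, type, dimension) for each argument of one block."""
    rows = []
    for name in block['args']:
        var = block['vars'].get(name, {})
        rows.append((name, var.get('typespec'), var.get('dimension', [])))
    return rows


# Once loaded, the structure serializes to real JSON (None becomes null).
as_json = json.dumps(blocks, indent=2, sort_keys=True)

for row in describe_args(blocks[0]):
    print(row)
```

Going through `ast.literal_eval` only makes sense for text captured from the terminal; a tool running in the same Python process could work on the list of dictionaries directly.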

Anyone looking for a Fortran parser should come across crackfortran first, before wasting time on the many sloppy, half-working parsers on the Internet. Yet its importance is currently understated because it is buried deep in NumPy internals. This patch therefore adds the changes required to expose crackfortran as an entry point next to f2py3, and to reveal it to the world.

@dmikushin dmikushin force-pushed the crackfortran_release branch from 9bc04b9 to ee1036e Compare December 4, 2023 14:10
@charris
Member

charris commented Dec 4, 2023

This would be a good topic to raise on the numpy mailing list.

@charris charris changed the title Release the crackfortran as a standalone tool for parsing Fortran declarations. ENH: Release crackfortran as a standalone tool for parsing Fortran. Dec 4, 2023
@dmikushin
Author

> This would be a good topic to raise on the numpy mailing list.

@charris, could you please help with this?

@charris
Member

charris commented Dec 4, 2023

You can subscribe to the mailing list at https://mail.python.org/accounts/signup/?next=/mailman3/lists/numpy-discussion.python.org/. Then just send an email to numpy-discussion@python.org. You can also open a numpy issue with the suggestion.

@rgommers
Member

rgommers commented Dec 8, 2023

@HaoZeke does this seem reasonable, or is (for example) LFortran already exposing a more maintainable and robust parser?

@dmikushin
Author

dmikushin commented Dec 8, 2023

@rgommers, honestly, any rich compiler-level technology is irrelevant to this proposal. The beauty of crackfortran is in its simplicity. Crackfortran parses the source (actually, only the interface part, not the code!) unarmed, using nothing but regex matching. This approach is geared toward extracting analytics from syntactically correct source. A compiler's parser, by contrast, has to be precise about many possible error states to help correct malformed code. Furthermore, extracting JSON usable by a non-compiler person from a yacc/bison-generated AST would be another tough adventure. So my point is: these are two different worlds. There are cars and carts, and good cars are not obsoleting the joy of carting.
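To illustrate the regex-only idea, here is a toy sketch. The patterns and the function `extract_declarations` are invented for this example and are not crackfortran's own. It assumes free-form source with one simple statement per line, and it does not handle array specifications such as `a(20,2)`.

```python
import re

# Toy patterns, NOT crackfortran's own: free-form source, one statement per line.
SUBROUTINE = re.compile(r'^\s*subroutine\s+(\w+)\s*\(([^)]*)\)', re.IGNORECASE)
DECLARATION = re.compile(
    r'^\s*(integer|real|logical|complex)\b[^:]*::\s*(.+)$', re.IGNORECASE)


def extract_declarations(source):
    """Collect the subroutine name, its arguments, and declared variable types."""
    info = {'name': None, 'args': [], 'vars': {}}
    for line in source.splitlines():
        line = line.split('!', 1)[0]  # drop trailing comments (ignores '!' in strings)
        m = SUBROUTINE.match(line)
        if m:
            info['name'] = m.group(1).lower()
            info['args'] = [a.strip() for a in m.group(2).split(',') if a.strip()]
            continue
        m = DECLARATION.match(line)
        if m:
            typespec = m.group(1).lower()
            for name in m.group(2).split(','):
                info['vars'][name.strip()] = {'typespec': typespec}
        # Every other line (executable statements, END) is simply skipped.
    return info


example = """
subroutine symbol(x, y, opt)
  real :: x, y      ! coordinates
  integer :: opt
  x = x + y
end subroutine symbol
"""
result = extract_declarations(example)
print(result)
```

The assignment `x = x + y` matches neither pattern and is ignored, which mirrors the point above: only the interface is read, and nothing is checked for correctness.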

Member

@HaoZeke HaoZeke left a comment

Thanks for this @dmikushin. I have a technical problem, which is that this needs to follow #24552 and use the argparse preparse for this.

I have some other thoughts which I'll comment with.

@HaoZeke
Member

HaoZeke commented Dec 8, 2023

> @HaoZeke does this seem reasonable, or is (for example) LFortran already exposing a more maintainable and robust parser?

LFortran has a more robust parser, but it is still less complete than that of f2py at the moment. It will take time. Perhaps more importantly, at the moment the development is such that a dependency on lfortran is quite heavy, and would include LLVM. This could also be fixed, but I am not sure when it would be feasible to split the parser development into a separate project. Perhaps @certik would be able to give more insight.

With regards to @dmikushin's eloquent comment:

> ... So my point is: these are two different worlds. There are cars and carts, and good cars are not obsoleting the joy of carting.

I actually tend to agree. @pearu has mentioned on several occasions that F2PY doesn't need to penalize the user for not constructing correct Fortran (also F2PY takes the !f2py directives and usercode which LFortran would need a plugin ecosystem to support).

The cleanest route to couple LFortran and F2PY in the near future would be to generate F2PY's crackfortran style input from LFortran's ASR which would then guarantee semantic correctness while keeping the backend from F2PY agnostic to changes. Since LFortran also can generate JSON representations, it would be a good place to start integrating the two in any case.

Some other projects also use crackfortran (e.g. sphinx-fortran) so I guess it could be a good idea to expose it.

However some more thoughts for @dmikushin:

  • Would need documentation
  • Use cases / examples
  • AFAIK we use recursive dictionaries, do they have a good representation in JSON?
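On the JSON question, a small sketch with Python's standard `json` module may help. It assumes that "recursive" means blocks nested inside a parent's `'body'` list; the sample dictionaries are made up for illustration and were not produced by crackfortran. Plain nesting round-trips cleanly, whereas a dictionary that refers back to itself is rejected by `json.dumps`.

```python
import json

# Nested dictionaries (a block whose 'body' holds further blocks) map directly to JSON.
# This sample is hand-written for illustration, not real crackfortran output.
nested = {'block': 'module', 'name': 'm',
          'body': [{'block': 'subroutine', 'name': 'symbol', 'body': [],
                    'implicit': None}]}
text = json.dumps(nested)
round_trip = json.loads(text)

# A dictionary that refers back to itself has no JSON form.
cyclic = {'name': 'child'}
cyclic['parent'] = cyclic
try:
    json.dumps(cyclic)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)

print(round_trip == nested)
print(cycle_error)
```

If the real structure ever held back-references or non-string keys, a cleanup pass would be needed before serialization.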

In principle, letting users clean up crackfortran's output on their own is a good idea; it could be paired with the hooks @pearu introduced in #19388.

@HaoZeke
Member

HaoZeke commented Nov 10, 2024

@dmikushin is this still relevant?

@dmikushin
Author

I'm still interested!

@pearu
Contributor

pearu commented Nov 14, 2024

Just some data points:

  • crackfortran is not a Fortran parser; LFortran is. One could say that crackfortran is a Fortran "parser" that parses only Fortran declarations: it does not parse Fortran executable statements (and never will, as these are irrelevant for f2py tasks).
  • crackfortran is able to "parse" Fortran 77 (as well as 90 and newer) while LFortran does not support Fortran 77. This (among other things) basically defines why LFortran-based tools cannot replace f2py.

Regarding releasing crackfortran as a standalone tool: I am not sure if it is worth it just for the sake of using crackfortran as a tool. Some parts of crackfortran depend on numpy, so you would need to have numpy available anyway for your software (update: I misunderstood the aim of this PR). Instead, I would recommend reviving https://github.com/pearu/f2py that provides a true Fortran parser in Python that supports both fixed and free formats (hence, also Fortran 77). But be warned, it will require a dedication of more than just a couple of months.

@certik
Contributor

certik commented Nov 14, 2024

> while LFortran does not support Fortran 77.

LFortran supports F77.

@pearu
Contributor

pearu commented Nov 14, 2024

> > while LFortran does not support Fortran 77.
>
> LFortran supports F77.

[OT Warning]
LFortran supports fixed format Fortran which is not the same as F77. For instance, lfortran gives syntax error to

      subroutinefoo
      end

which is valid Fortran 77 source code. A true Fortran 77 parser cannot rely on the kind of modern tokenizer that LFortran uses. Hence my claim.
[End of OT Warning]
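Why `subroutinefoo` is legal: in Fortran 77 fixed source form, the statement occupies columns 7 to 72, and blanks are not significant outside character constants, so keywords and names cannot be separated by whitespace alone. The following toy sketch shows the consequence for a recognizer. The function names are hypothetical, character constants are ignored, and matching by keyword prefix is only a heuristic, since a line such as an assignment to a variable named `subroutinefoo` would need further disambiguation.

```python
import re


def statement_field(line):
    """Return columns 7-72 of a fixed-form line, the part that holds the statement."""
    return line[6:72]


def squeeze(statement):
    """Drop blanks and lower-case; ignores character constants for brevity."""
    return statement.replace(' ', '').lower()


def subroutine_name(line):
    """Return the name if the line is an argument-less SUBROUTINE statement, else None."""
    m = re.fullmatch(r'subroutine([a-z][a-z0-9]*)', squeeze(statement_field(line)))
    return m.group(1) if m else None


# All three spellings denote the same statement in fixed form.
names = [subroutine_name('      subroutinefoo'),
         subroutine_name('      SUBROUTINE FOO'),
         subroutine_name('      s u b r o u t i n e  f o o')]
print(names)
```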

@HaoZeke
Member

HaoZeke commented Nov 14, 2024

AFAIK LFortran plans to support all of F77, but is continually in development, so eventually it will get there.

https://github.com/pearu/f2py is a good place to look, but the frontend (CLI) should follow the newer f2pyarg design (from #21923)... I'd be happy to mentor someone working on upgrading the parser in-place, but likely won't have too much time myself in the near future (would be happy to be proven wrong of course).

Practically speaking of course, one of the current highest barriers to major changes within F2PY is the lack of existing test coverage within NumPy and in the larger context of parsing Fortran code.

@dmikushin could you clarify the scope of your proposed work? Was the intent just to move the existing crackfortran code into a callable module? Or is this a discussion from which a larger project is meant to emerge?

@certik
Contributor

certik commented Nov 14, 2024

> For instance, lfortran gives syntax error to

You have to tell LFortran to use its fixed-form parser:

$ lfortran --fixed-form a.f --show-ast
(TranslationUnit
    [(Subroutine
        foo
        []
        []
        ()
        ()
        []
        []
        []
        []
        []
        []
        []
    )]
)

LFortran supports fixed-form, for example it can parse all of SciPy correctly.

If you find any bug, you can report it, but overall LFortran supports F77 and fixed-form. You can contact me offline if you want to check with me before posting online what LFortran can or cannot do.

6 participants