
ENH: Python as a templating language(PyAS) #17952


Closed
wants to merge 1 commit into from

Conversation

seiko2plus
Member

Implements a simple template engine that simplifies the use of
Python for generating C code.

It works much like the PHP language: all you have to do is write the
Python code between special tags, e.g. <? print("Hello World!") ?>.

The new templates require a new extension, '.pyas', so that they can be recognized
by distutils.
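
To make the PHP-style idea concrete, here is a minimal sketch using only the standard library. It is not the PR's implementation: the `render` function, the `TAG` regex, and the choice to capture `print` output are assumptions made for illustration, and only the `<? ?>` tag is handled.

```python
import contextlib
import io
import re
import textwrap

# Hypothetical tag pattern: anything between "<?" and "?>" is Python code.
TAG = re.compile(r"<\?(.*?)\?>", re.DOTALL)

def render(template, namespace=None):
    """Copy literal text through and replace each tag with what its code prints."""
    namespace = {} if namespace is None else namespace
    out = []
    pos = 0
    for match in TAG.finditer(template):
        out.append(template[pos:match.start()])  # literal C text before the tag
        code = textwrap.dedent(match.group(1)).strip()
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, namespace)  # one shared namespace, so names persist across tags
        out.append(buf.getvalue())
        pos = match.end()
    out.append(template[pos:])  # literal text after the last tag
    return "".join(out)

source = 'int x = <? print(6 * 7, end="") ?>;\n'
result = render(source)
```

Because all tags share one namespace, a helper defined in an early tag can be called from a later one, which is what makes the approach feel like ordinary Python.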

@eric-wieser
Member

What's the rationale for rolling our own, instead of using something like jinja2?

@seiko2plus
Member Author

@eric-wieser, they are not friendly to the C language, and most of them focus on avoiding the use of Python, while PyAS embraces
Python instead. That makes sense, since they're mainly designed for HTML.
For example, jinja2 uses curly braces to delimit statements, which hurts the readability of the C code.

@eric-wieser
Member

eric-wieser commented Dec 7, 2020

For example, jinja2 uses curly braces to delimit statements, which hurts the readability of the C code.

jinja2 lets you pick whatever separator you like for the jinja markers - {{ and }} are just the defaults.

@seiko2plus
Member Author

@eric-wieser, yes, that's right, I missed that. Does jinja2 allow executing Python code?

@eric-wieser
Member

It doesn't allow executing arbitrary code, but it lets you pass arbitrary functions in from the file running the template.

@seiko2plus
Member Author

@eric-wieser, how can we manage that in a source tree? Usually each template source is a standalone file, and bringing in a big template engine and assigning Python functions in setup.py for one-time use isn't robust.
What PyAS offers is just a simpler way to use Python, and it may help us improve our current Python generator files.

@eric-wieser
Member

We already have two or three custom templating languages inside numpy. I'm wary of inventing a new one simply because it's one more tool for contributors to numpy to have to learn how to use, and one that they will definitely never have seen before. Maybe jinja is also a bad choice, but I think if there's an existing solution we can use (even if we end up vendoring it like we did with tempita), then it will be easier to onboard new developers onto it.

@seberg
Member

seberg commented Dec 7, 2020

How hard would it be to use tempita here as well? With the possible modification of not using curly braces for the C code, which indeed might not be ideal (although I am not sure it is terrible, since {{ is very rare in C and typical C braces are almost always followed by a new line in our C code style).

@seiko2plus
Member Author

seiko2plus commented Dec 7, 2020

@eric-wieser,

We already have two or three custom templating languages inside numpy.

Still, we decided to use Python instead for generating C source files in many situations,
e.g. generate_umath.py

I'm wary of inventing a new one simply because it's one more tool for contributors to numpy to have to learn how to use

And that's exactly what PyAS addresses, since it doesn't have any custom syntax, just Python and f-strings.

Maybe jinja is also a bad choice, but I think if there's an existing solution

Almost all existing solutions exist to serve HTML, and they mainly target real-time apps,
which makes them slower to use in one-time executions.

@seberg,

How hard would it be to use tempita here as well?

It is very easy to implement any kind of template engine through PyAS; all we have to do is
assign a new tag to it, e.g. <?pita tempita code ?>, or even jinja or
our other template engines. But I don't see a reason to use them over f-strings, at least for now.

@eric-wieser
Member

Ah, I remember something that I'd forgotten before. A really important feature of using a templating language to generate C code is that the generated C code contains #line markers so that compiler errors are expressed in terms of the thing the developer is editing. I believe we do this already with .c.src, but I don't see a mechanism to do that here. I think I remember this being hard in jinja too...

@seiko2plus
Member Author

seiko2plus commented Dec 7, 2020

@eric-wieser,

A really important feature of using a templating language to generate C code is that the generated C code contains #line

That was actually my main concern when I wrote this engine. It strictly respects line numbers: you can get the current line number from the current stack frame or through the template utility function lineno().

for example:

static void my_function(...)
{
<% #line {{lineno()}} "{{filename()}}" %>
}
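
As a rough illustration of reading the line number from the stack frame, the sketch below pads a snippet with newlines so that Python's own line counter matches the template line. The helper names `lineno`, `filename`, and `run_snippet` are assumptions for this example and are not taken from the PR.

```python
import inspect

def lineno():
    """Line number of the caller, read from its stack frame."""
    return inspect.currentframe().f_back.f_lineno

def filename():
    """File name the calling code was compiled under."""
    return inspect.currentframe().f_back.f_code.co_filename

def run_snippet(code, template_name, first_line):
    """Execute `code` as if it started at `first_line` of `template_name`."""
    # Leading newlines shift the compiled line numbers to match the template.
    padded = "\n" * (first_line - 1) + code
    namespace = {"lineno": lineno, "filename": filename}
    exec(compile(padded, template_name, "exec"), namespace)
    return namespace

ns = run_snippet("here = lineno()\nwhere = filename()\n", "demo.c.pyas", 40)
directive = f'#line {ns["here"]} "{ns["where"]}"'
```

With this trick, Python tracebacks raised inside a tag also point at the template file and line rather than at an anonymous string.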

@eric-wieser
Member

Right, but for template code like

<% for i in range(10):
    print(<%%void func_{{i}}() {%%>)
    if i % 2 == 0:
        print("error;")
    print(<%%}%%>)
%>

you want the #line macros to be inserted automatically. Perhaps for your template language it's sufficient to replace every newline output by print with the current line number.
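
One way to picture automatic insertion is sketched below. It assumes the engine can tag every emitted C line with the template line that produced it; the function name `emit_with_line_markers` and the `(lineno, text)` pair format are hypothetical. A `#line` directive is written only when the template line is not the one the compiler would already expect.

```python
def emit_with_line_markers(pieces, filename):
    """pieces: iterable of (template_lineno, c_line) pairs, one C line each."""
    out = []
    expected = None  # template line the compiler assumes comes next
    for lineno, c_line in pieces:
        if lineno != expected:
            # Resynchronize: the line after the directive is reported as `lineno`.
            out.append(f'#line {lineno} "{filename}"')
        out.append(c_line)
        expected = lineno + 1
    return "\n".join(out) + "\n"

# A loop body on template lines 2-3 that is expanded twice.
pieces = [(2, "void func_0() {"), (3, "}"), (2, "void func_1() {"), (3, "}")]
generated = emit_with_line_markers(pieces, "demo.c.pyas")
```

Here two directives are emitted for four C lines, one per loop iteration, so consecutive lines from the same template region do not each get their own marker.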

@seiko2plus
Member Author

seiko2plus commented Dec 7, 2020

you want the #line macros to be inserted automatically.

That would not be good for the compiler. The .src templates literally spam the compiler by adding
#line before almost every line, while all you need is to add it before each loop.

PyAS usually calls for more creativity in organizing the code. For example, why not treat
Python functions as macros? The template utilities also provide a function called include(), which works
like the PHP one: it simply includes another PyAS template file in the current scope, which can be useful for
holding extra utilities. PyAS also supports a better mechanism for translating strings, which
makes it very effective for SIMD code. Honestly, the motive behind PyAS, for me at least, is to
find an alternative to C++ templates.

There's a misunderstanding here; I guess the documentation wasn't good enough. The tags are as follows:

  • <? ?> execute python code
  • <% %> performs -> print + friendly f-string
  • <%% %%> performs -> only friendly f-string and only works within <? ?>

so your example should look like the following:

<?for I in range(10):
    <%
    #line {{lineno()}}
    void func_$I()
    {
    %>
    if I % 2 == 0: <%
        error;
    %>
    <%
    }
    %>
?>

but why not use f-string field replacement instead of shuffling tags?

<?
def If(cond, v, v_else=''):
    if cond:
        return v
    return v_else

# note: `$I` is a feature of the friendly f-string (ff-string) tags called "local on the fly"
for I in range(10): <%
    #line {{lineno()}}
    void func_$I()
    {
        {{If(I % 2 == 0, 'error;')}}
    }
    %>
?>
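
For readers wondering how a "friendly f-string" could coexist with C braces, here is a minimal sketch. The syntax it accepts (`{{expr}}` for expressions, `$name` for variables) is inferred from the examples in this thread, and `expand` is a hypothetical helper, not the PR's code.

```python
import re

# Assumed syntax: {{expr}} evaluates a Python expression, $name reads a variable.
FIELD = re.compile(r"\{\{(.+?)\}\}|\$([A-Za-z_]\w*)")

def expand(text, variables):
    """Substitute {{expr}} and $name while leaving single C braces untouched."""
    def replace(match):
        expr, name = match.group(1), match.group(2)
        if expr is not None:
            return str(eval(expr, {}, variables))
        return str(variables[name])
    return FIELD.sub(replace, text)

def If(cond, v, v_else=""):
    return v if cond else v_else

body = "void func_$I()\n{\n    {{If(I % 2 == 0, 'error;')}}\n}\n"
generated = "".join(expand(body, {"I": i, "If": If}) for i in range(2))
```

Only doubled braces and `$`-prefixed names are treated as fields, so the single `{` and `}` of the C function body pass through without any escaping.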

or maybe via the C preprocessor?

<?for I in range(10): <%
    #line {{lineno()}}
    void func_$I()
    {
        #if {{int(I % 2 == 0)}}
        error;
        #endif
    }
%>?>

@charris
Member

charris commented Dec 8, 2020

What's the rationale for rolling our own, instead of using something like jinja2

It isn't that it has never been looked at, but the current templating tools all work better for the C code. I even wrote my own lightweight jinja version before jinja was a thing :) However, it would be nice to get rid of the separate f2py code template tool that came in when that project was taken over.

@matthew-brett
Contributor

It does seem reasonable to have another look at this problem now Python has f-strings.

@charris
Member

charris commented Dec 8, 2020

It does seem reasonable to have another look at this problem now Python has f-strings.

IIRC, the main problem was concise specification of the looping variables and nested loops. It didn't look pretty with the standard templating applications. I did fool around with that using a lightweight templating application that I made by pulling out parts of Django. Which is also where jinja comes from, but jinja came a few years later.

@seiko2plus
Member Author

seiko2plus commented Dec 8, 2020

@charris, there's a high-priority need to rework _simd.dispatch.c.src via Python (PyAS, which is what I'm suggesting here) or any other alternative solution the community prefers, in order to generate an RST document gathering all universal intrinsics.

@charris
Member

charris commented Dec 8, 2020

@seiko2plus You could maybe modify the existing template function if the aim is to generate docs. Perhaps add a documentation template for the machinery?

@charris
Member

charris commented Dec 8, 2020

Could maybe have a /**documentation */ marker, maybe output a *.txt file, or even a jinja template or some such.

@seiko2plus
Member Author

seiko2plus commented Dec 8, 2020

@charris, what I need is exactly the following:

  • generate the signature of the intrinsic based on the data types provided to the C macro, or to the Python function in the case of PyAS
  • pass the final doc to the C method def
  • add a Python-level check to validate the data types before generating the final C function
  • improve the line-number compiler hints to simplify tracing errors
  • increase the readability
  • cache the generated function via a C macro to speed up compilation

I imagine the situation as follows:

Create a new PyAS utility template file, _simd_easyintrin.inc.pyas, to replace the C one, _simd_easyintrin.inc;
it should contain all the fundamental functions.

Example of how it looks (_simd_easyintrin.inc.pyas):
<?
def cmacro_cache(func):
    """
    Decorator that converts the body of a C method into a macro to cache it.
    """
    pass

def intrin_signature(intrin_name, *args):
    """
    generate a signature for NPYV intrinsic
    """
    pass

def add_intrin_body_1(Name, Tret):
    """
    Body of an intrinsic with no params
    """
    return <%%1
    if (!PyArg_ParseTuple(args, ":$Name")) {
        return NULL;
    }
    simd_arg a = {.dtype = simd_data_$Tret, .data = {.$Tret = npyv_$Name()} };
    return simd_arg_to_obj(&a);
    %%>

def add_intrin_body_0(Name): 
    """
    Body of void intrinsic with no params
    """
    return <%%1
    if (!PyArg_ParseTuple(args, ":$Name")) {
        return NULL;
    }
    npyv_$Name();
    Py_RETURN_NONE;
    %%>

@cmacro_cache
def add_intrin_body(Name, Tret, *inputs):
    """
    Body of an intrinsic with n params
    """
    Largs = len(inputs)
    return <%%1
    simd_arg simd_args[$Largs] = {
        {{','.join([f'{.dtype = simd_data_{{t}}}' for t in inputs])}}
    };
    const char *fmt = "{{'O&'*Largs}}:$Name";
    if (!PyArg_ParseTuple(args, fmt,
        {{ ','.join([f'simd_arg_converter, &simd_args[{{i}}]' for i in range(Largs)])}}
    )) {
        return NULL;
    }
    simd_data data;
    #line {{lineno()}} "{{filename()}}"
    data.$Tret = npyv_$Name({{
        ','.join([f'simd_args[{{i}}].data.{{t}}' for i, t in enumerate(inputs)])
    }});
    for (int i = 0; i < $Largs; ++i) {
        simd_arg_free(&simd_args[i]);
    }
    simd_arg ret = {.data = data, .dtype = simd_data_$Tret};
    return simd_arg_to_obj(&ret);
    %%>


import textwrap  # used by add_intrin() below to dedent the doc

added_intrinsics = []
def add_intrin(doc, name, *args, body=None):
    """
    Add a new NPYV intrinsic method
    """
    # 1- validate args
    # TODO
    # 2- generate doc
    doc = textwrap.dedent(doc).strip()
    doc_lines = doc.splitlines()
    doc_lines.insert(0, intrin_signature(name, *args))
    doc_lines = [f'"{l}\\n"' for l in doc_lines]
    added_intrinsics.append((''.join(doc_lines), name, args))

    if not body:
        body = globals().get(f"add_intrin_body_{len(args)}", add_intrin_body)

    <%0
    #line {{lineno()}} "{{filename()}}"
    static PyObject *simd__intrin_{{name}}(PyObject* NPY_UNUSED(self), PyObject *args)
    {
        {{body(name, *args)}}
    }
    %>
?>
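
The `intrin_signature` stub and the "validate args" TODO in the prototype above are left open. One possible shape for them is sketched here in plain Python. Everything in it is an assumption: the `v`/`q` prefix convention is inferred from the calls in this thread, and the parameter names `a0`, `a1`, ... and the `void` fallback are invented for illustration.

```python
KNOWN_SUFFIXES = "u8 s8 u16 s16 u32 s32 u64 s64 f32 f64".split()
# Assumed naming: "v" + suffix is a vector type, "q" + suffix a sequence type.
KNOWN_TYPES = {prefix + sfx for prefix in "vq" for sfx in KNOWN_SUFFIXES}

def validate_types(*types):
    """Reject data type names that the generator does not know about."""
    unknown = [t for t in types if t is not None and t not in KNOWN_TYPES]
    if unknown:
        raise ValueError(f"unknown SIMD data types: {unknown}")

def intrin_signature(intrin_name, *args):
    """Build a doc signature; args[0] is taken to be the return type (None for void)."""
    validate_types(*args)
    ret, *inputs = args or (None,)
    params = ", ".join(f"{t} a{i}" for i, t in enumerate(inputs))
    return f"{intrin_name}({params}) -> {ret or 'void'}"

load_sig = intrin_signature("load_u8", "vu8", "qu8")
store_sig = intrin_signature("store_u8", None, "qu8", "vu8")
```

Doing the validation in Python means a misspelled type name fails at generation time with a readable message instead of surfacing later as a C compiler error.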

Create a new PyAS template file, _simd.dispatch.c.pyas, to replace the C one, _simd.dispatch.c.src;
it should contain all of the C Python methods.

Example of how it looks (_simd.dispatch.c.pyas):
/*@targets $werror #simd_test */
<?include("_simd_easyintrin.inc.pyas")?>
<% #line {{lineno()}} "{{filename()}}" %>
#include "_simd.h"
#include "_simd_inc.h"
#if NPY_SIMD
#include "_simd_data.inc"
#include "_simd_convert.inc"
#include "_simd_vector.inc"
#include "_simd_arg.inc"

/*************************************************************************
 * Defining NPYV intrinsics as module functions
 *************************************************************************/
/***************************
 * Memory
 ***************************/
<?
swidth   = (128, 256, 512)
swidth_b = (16,  32,  64)
swidth_h = (64,  128, 256)
align_note = lambda sfx: f"On `NEON/A32` the base pointer must be aligned on {sfx[1:]}-bit"

for sfx in "u8 s8 u16 s16 u32 s32 u64 s64 f32 f64".split():
    add_intrin(
        f"""
        Load {swidth}-bit from memory according to the enabled SIMD extension width.

        **Notes**:
            - {align_note(sfx)}
        """,
        f"load_{sfx}", f"v{sfx}", f"q{sfx}"
    )
    add_intrin(
        f"""
        Load {swidth}-bit from memory according to the enabled SIMD extension width,
        the base pointer must be aligned on a {swidth_b}-byte boundary.

        **Notes**:
            - On `NEON/A32/A64` mapping to `npyv_load_{sfx}()`
        """,
        f"loada_{sfx}", f"v{sfx}", f"q{sfx}"
    )
    add_intrin(
        f"""
        Load {swidth}-bit from memory using non-temporal hint according to the enabled
        SIMD extension width, the base pointer must be aligned on a {swidth_b}-byte boundary.

        **Notes**:
            - On `VSX`, `SSE < 4.1` mapping to `npyv_loada_{sfx}()`
            - On `NEON/A32/A64` mapping to `npyv_load_{sfx}()`
        """,
        f"loads_{sfx}", f"v{sfx}", f"q{sfx}"
    )
# example of a function that needs to be defined manually
def body_of_store(Name, Tret, Tseq, Tvec): return <%%
    #line {{lineno()}}
    simd_arg seq_arg = {.dtype = simd_data_$Tseq};
    simd_arg vec_arg = {.dtype = simd_data_$Tvec};
    if (!PyArg_ParseTuple(
        args, "O&O&:$Name",
        simd_arg_converter, &seq_arg,
        simd_arg_converter, &vec_arg
    )) {
        return NULL;
    }
    npyv_$Name(seq_arg.data.$Tseq, vec_arg.data.$Tvec);
    if (simd_sequence_fill_iterable(seq_arg.obj, seq_arg.data.$Tseq, simd_data_$Tseq)) {
        simd_arg_free(&seq_arg);
        return NULL;
    }
    simd_arg_free(&seq_arg);
    Py_RETURN_NONE;
    %%>

for sfx in "u8 s8 u16 s16 u32 s32 u64 s64 f32 f64".split():
    add_intrin(
        f"""
        Store {swidth}-bit from `a` into memory according to the enabled SIMD extension width.

        **Notes**:
            - {align_note(sfx)}
        """,
        f"store_{sfx}", None, f"q{sfx}", f"v{sfx}",  body=body_of_store
    )
    add_intrin(
        f"""
        Store {swidth}-bit from `a` into memory according to the enabled SIMD extension width,
        the base pointer must be aligned on a {swidth_b}-byte boundary.

        **Notes**:
            - On `NEON/A32/A64` mapping to `npyv_store_{sfx}()`
        """,
        f"storea_{sfx}", None, f"q{sfx}", f"v{sfx}", body=body_of_store
    )
?>
/*************************************************************************
 * Attach module functions
 *************************************************************************/
<% #line {{lineno()}} %>
static PyMethodDef simd__intrinsics_methods[] = {
<?for Doc, Name, args in added_intrinsics: <%
    #line {{lineno()}}
    {"$Name", simd__intrin_$Name, METH_VARARGS, $Doc},
%>?>
    {NULL, NULL, 0, NULL}
};
#endif // NPY_SIMD
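
The doc handling and the method table in the prototype above can also be expressed with ordinary f-strings. The sketch below is only an illustration of that step: `c_doc_literal` and `method_table` are hypothetical helpers, and the escaping of quotes and backslashes is an addition that the prototype does not perform.

```python
import textwrap

def c_doc_literal(doc, signature):
    """Turn a Python docstring into adjacent C string literals, one per line."""
    lines = [signature] + textwrap.dedent(doc).strip().splitlines()
    # Escape backslashes first, then double quotes, so the result is valid C.
    escaped = (l.replace("\\", "\\\\").replace('"', '\\"') for l in lines)
    return "\n".join(f'"{l}\\n"' for l in escaped)

def method_table(intrinsics):
    """intrinsics: iterable of (doc_literal, name) pairs."""
    rows = [f'    {{"{name}", simd__intrin_{name}, METH_VARARGS, {doc}}},'
            for doc, name in intrinsics]
    rows.append("    {NULL, NULL, 0, NULL}")  # sentinel entry ends the table
    return ("static PyMethodDef simd__intrinsics_methods[] = {\n"
            + "\n".join(rows) + "\n};\n")

doc = c_doc_literal(
    """
    Load data from memory.
    """,
    "load_u8(qu8 a0) -> vu8",
)
table = method_table([(doc, "load_u8")])
```

In an f-string, literal C braces have to be doubled, which is the readability cost that the `<% %>` tags in the prototype are meant to avoid.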

EDIT: update examples

@seiko2plus
Member Author

@charris, the above examples aren't final; please consider them a prototype.

@seiko2plus added the "57 - Close?" label Jan 9, 2021
@mhvk
Contributor

mhvk commented Jan 9, 2021

@seiko2plus - while sharing worries about duplication, your solution does seem substantially nicer than using jinja2 - we wrapped a C library using ufuncs and the resulting jinja template file is not pretty. Much of the work also ends up being done in python anyway, as we base the template on c function comments.
That said, @eric-wieser is also right to warn about problems for newcomers - I never really understood what is happening in the *.c.src files...

@seiko2plus
Member Author

@mhvk,

Much of the work also ends up being done in python anyway

That's the whole idea behind this solution.

That said, @eric-wieser is also right to warn about problems for newcomers

I see it as more friendly for newcomers if you compare it with web template engines. It's still Python after all.

@mhvk
Contributor

mhvk commented Jan 17, 2021

I see it as more friendly for newcomers if you compare it with web template engines. It's still Python after all.

If one were to choose a single tool, indeed, but there are already a couple... Though perhaps this can already replace something? In particular, tempita seems to be used for exactly two files: numpy/random/_bounded_integers.p{yx,xd}.in, so perhaps one can just rewrite those two and remove tempita? (though looking at #8096, it seems to be there since cython uses it?! And from the discussion at http://numpy-discussion.10968.n7.nabble.com/Vendorize-tempita-td43505.html it may be that scipy uses npy_tempita...)

Also, the main argument for an existing template engine like jinja2 is that at least some people will have seen it before (and one can easily find documentation/examples/etc), so less to learn.

@matthew-brett
Contributor

I don't know about you - but I find I forget how to use Jinja2 every time, and have to remind myself.

@eric-wieser
Member

I don't know about you - but I find I forget how to use Jinja2 every time, and have to remind myself.

I've had that experience a couple times, but at least it's very easy to remind yourself from the documentation and stackoverflow questions.

Base automatically changed from master to main March 4, 2021 02:05
@seiko2plus
Member Author

Closing this PR in favor of C++11: type safety, operators, function overloading, templates, constant expressions, aliases, auto, etc. are all we want for managing our SIMD kernels in a robust and efficient way.

During the last community meeting (14th of April) we had a quick discussion, and it seems everybody welcomes bringing C++11 to NumPy.

@seiko2plus seiko2plus closed this Apr 14, 2021
Labels
01 - Enhancement 57 - Close? Issues which may be closable unless discussion continued

6 participants