From aba1021cecc441a04e5506717aa4ded373d038d4 Mon Sep 17 00:00:00 2001
From: Rodrigo Tobar <rtobar@icrar.org>
Date: Thu, 25 Nov 2021 09:42:03 +0800
Subject: [PATCH 1/4] Improve several aspects of the create_dict script
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The previous version of this script loaded each file in full before
adding it to the set, which uses more memory than necessary. This
operation can be done more efficiently with generator expressions, so
that each file's lines are stripped and added to the set one at a time.

The script also needlessly tried to remove the empty string from the
resulting set; this can be avoided by filtering out empty lines as
elements are added to the set.
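
The two points above can be sketched as follows. This is an
illustration under stated assumptions, not the script itself: the names
iter_entries and load_entries are hypothetical, and iterating over the
file object directly (instead of calling readlines()) is an assumption
made here to keep the reading lazy.

```python
from pathlib import Path


def iter_entries(directory):
    """Yield each non-empty, stripped line of every *.txt file in directory.

    Hypothetical helper, used only for illustration.
    """
    for filename in sorted(Path(directory).glob("*.txt")):
        with open(filename, "r", encoding="utf-8") as f:
            for line in f:  # the file object yields one line at a time
                stripped = line.strip()
                if stripped:  # skip blank lines so "" never enters the set
                    yield stripped


def load_entries(directory="dictionaries"):
    """Collect the unique entries of all dictionaries into a set."""
    return set(iter_entries(directory))
```

Because empty strings are filtered while the set is built, no later
removal step is needed.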

Finally, the docstring still referred to the old "dict" file; those
references have been removed. The docstring was also moved above the
imports, so that it is actually the module docstring.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
---
 scripts/create_dict.py | 25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/scripts/create_dict.py b/scripts/create_dict.py
index 5d95ea4a8a..57f1078e9a 100644
--- a/scripts/create_dict.py
+++ b/scripts/create_dict.py
@@ -1,29 +1,20 @@
-from pathlib import Path
-
 """
 Script to generate the 'dict.txt' dictionary based
-on the custom dictionaries under the 'dictionaries/' directory,
-but also considering the old words from the 'dict' file.
-
-This was done with:
-    awk 1 dict dictionaries/*.txt > dict.txt
-but the problem was that windows users, not using Git bash
-have the problem that 'awk' is not a valid command, so this
-enable them to use the script instead.
+on the custom dictionaries under the 'dictionaries/' directory.
 """
 
+from pathlib import Path
+
 entries = set()
 
 # Read custom dictionaries
 for filename in Path("dictionaries").glob("*.txt"):
     with open(filename, "r") as f:
-        lines = [i.rstrip() for i in f.readlines()]
-    if lines:
-        entries.update(set(lines))
-        del lines
-
-# Remove empty string, from empty lines
-entries.remove("")
+        entries.update(
+            stripped_line
+            for stripped_line in (line.strip() for line in f.readlines())
+            if stripped_line
+        )
 
 # Write the 'dict.txt' file
 with open("dict.txt", "w") as f:

From a0c417beda2e04885bdaeaf18d31c593e3553e57 Mon Sep 17 00:00:00 2001
From: Rodrigo Tobar <rtobar@icrar.org>
Date: Tue, 30 Nov 2021 14:53:10 +0800
Subject: [PATCH 2/4] Rename create_dict.py -> check_spell.py
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Users who want to spell-check the .po files currently have to perform
two steps: invoke the create_dict.py script, which generates a dict.txt
file (a dictionary that merges all the dictionaries under
dictionaries/), and then invoke pospell with this generated dictionary.

This commit takes the create_dict.py script and makes it invoke pospell
after generating the dict.txt dictionary (which is now written to a
temporary file). The script therefore performs the complete spell
check, so it has been renamed to check_spell.py.

If no arguments are given, the script checks every .po file in the
repository; otherwise it checks only the .po files given on the command
line.
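
The fallback described above can be sketched as follows. The function
name select_po_files and its parameters are hypothetical, used only for
illustration; the "*/*.po" pattern is the one the script uses.

```python
from pathlib import Path


def select_po_files(argv, root="."):
    """Return the files named in argv, or every */*.po under root if none.

    Hypothetical helper, used only for illustration.
    """
    if argv:
        # The user named specific files: check only those.
        return list(argv)
    # No arguments: fall back to all .po files one directory level down.
    return sorted(str(path) for path in Path(root).glob("*/*.po"))
```

A caller would typically pass sys.argv[1:] as argv.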

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
---
 scripts/check_spell.py | 35 +++++++++++++++++++++++++++++++++++
 scripts/create_dict.py | 24 ------------------------
 2 files changed, 35 insertions(+), 24 deletions(-)
 create mode 100644 scripts/check_spell.py
 delete mode 100644 scripts/create_dict.py

diff --git a/scripts/check_spell.py b/scripts/check_spell.py
new file mode 100644
index 0000000000..e9193665f6
--- /dev/null
+++ b/scripts/check_spell.py
@@ -0,0 +1,35 @@
+"""
+Script to check the spelling of one, many or all .po files based
+on the custom dictionaries under the 'dictionaries/' directory.
+"""
+
+from pathlib import Path
+import sys
+import tempfile
+
+import pospell
+
+# Read custom dictionaries
+entries = set()
+for filename in Path("dictionaries").glob("*.txt"):
+    with open(filename, "r") as f:
+        entries.update(
+            stripped_line
+            for stripped_line in (line.strip() for line in f.readlines())
+            if stripped_line
+        )
+
+# Write merged dictionary file
+output_filename = tempfile.mktemp(suffix="_merged_dict.txt")
+with open(output_filename, "w") as f:
+    for e in entries:
+        f.write(e)
+        f.write("\n")
+
+# Run pospell either against all files or the file given on the command line
+po_files = sys.argv[1:]
+if not po_files:
+    po_files = Path(".").glob("*/*.po")
+
+errors = pospell.spell_check(po_files, personal_dict=output_filename, language="es_ES")
+sys.exit(0 if errors == 0 else -1)
diff --git a/scripts/create_dict.py b/scripts/create_dict.py
deleted file mode 100644
index 57f1078e9a..0000000000
--- a/scripts/create_dict.py
+++ /dev/null
@@ -1,24 +0,0 @@
-"""
-Script to generate the 'dict.txt' dictionary based
-on the custom dictionaries under the 'dictionaries/' directory.
-"""
-
-from pathlib import Path
-
-entries = set()
-
-# Read custom dictionaries
-for filename in Path("dictionaries").glob("*.txt"):
-    with open(filename, "r") as f:
-        entries.update(
-            stripped_line
-            for stripped_line in (line.strip() for line in f.readlines())
-            if stripped_line
-        )
-
-# Write the 'dict.txt' file
-with open("dict.txt", "w") as f:
-    for e in entries:
-        f.write(e)
-        f.write("\n")
-print("Created 'dict.txt'")

From b4bd50eafd93b058bdd38182535fa61c5de71c47 Mon Sep 17 00:00:00 2001
From: Rodrigo Tobar <rtobar@icrar.org>
Date: Thu, 25 Nov 2021 11:13:47 +0800
Subject: [PATCH 3/4] Update GitHub workflow, Makefile and pre-commit hooks to
 use check_spell.py
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

For the GitHub workflow and the Makefile the change is simple: it is
enough to replace the two previous commands (create_dict.py followed by
pospell) with the new script, and everything keeps working.

For pre-commit, we can drop the hook that runs pospell directly and
instead declare the pospell package as an additional dependency of our
local hook, which now runs check_spell.py.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
---
 .github/workflows/main.yml |  3 +--
 .pre-commit-config.yaml    | 14 +++++---------
 Makefile                   |  3 +--
 3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index eb4981e8fa..ce3f782041 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -31,7 +31,6 @@ jobs:
         run: powrap --check --quiet **/*.po
       - name: Pospell
         run: |
-          python scripts/create_dict.py
-          pospell -p dict.txt -l es_ES **/*.po
+          python scripts/check_spell.py
       - name: Construir documentación
         run: PYTHONWARNINGS=ignore::FutureWarning sphinx-build -j auto -W --keep-going -b html -d cpython/Doc/_build/doctree -D language=es . cpython/Doc/_build/html
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index ed6e250ad1..1d5d78b9e1 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -5,13 +5,9 @@ repos:
     -   id: powrap
 -   repo: local
     hooks:
-    -   id: merge-dicts
-        name: merge-dicts
-        entry: python ./scripts/create_dict.py
+    -   id: check-spell
+        name: Check spelling
+        entry: python ./scripts/check_spell.py
         language: python
-# This one requires package ``hunspell-es_es`` in Archlinux
--   repo: https://github.com/AFPy/pospell
-    rev: v1.1
-    hooks:
-    -   id: pospell
-        args: ['--personal-dict', 'dict.txt', '--language', 'es_ES', '--language', 'es_AR']
+        additional_dependencies: ['pospell>=1.1']
+        files: \.po$
diff --git a/Makefile b/Makefile
index becb1f77b5..d59324f1ee 100644
--- a/Makefile
+++ b/Makefile
@@ -89,8 +89,7 @@ progress: venv
 
 .PHONY: spell
 spell: venv
-	$(VENV)/bin/python scripts/create_dict.py
-	$(VENV)/bin/pospell -p dict.txt -l es_ES **/*.po
+	$(VENV)/bin/python scripts/check_spell.py
 
 
 .PHONY: wrap

From a3938077d172b9fd367dd23ebe27ba97495e5e58 Mon Sep 17 00:00:00 2001
From: Rodrigo Tobar <rtobar@icrar.org>
Date: Thu, 25 Nov 2021 15:54:21 +0800
Subject: [PATCH 4/4] Update FAQ on how to run the spell check
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Users no longer need to perform two separate steps; they now only need
to run a single script.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
---
 .overrides/faq.rst | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/.overrides/faq.rst b/.overrides/faq.rst
index 06d2d44267..b68a494914 100644
--- a/.overrides/faq.rst
+++ b/.overrides/faq.rst
@@ -26,8 +26,7 @@ pospell. Pospell puede ser instalada en tu entorno de Python empleando pip
 Una vez instalado, para chequear el fichero .po sobre el que estás trabajando,
 ejecuta desde el directorio principal del repo::
 
-    python scripts/create_dict.py  # para crear el archivo 'dict.txt'
-    pospell -p dict.txt -l es_ES path/tu_fichero.po
+    python scripts/check_spell.py path/tu_fichero.po
 
 pospell emplea la herramienta de diccionarios hunspell. Si pospell falla dando
 como error que no tiene hunspell instalado, lo puedes instalar así: