
Make a beginning of parsing GNU date items with nom #25

Closed
wants to merge 3 commits into from
69 changes: 43 additions & 26 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -10,3 +10,4 @@ readme = "README.md"
[dependencies]
regex = "1.8"
chrono = { version="0.4", default-features=false, features=["std", "alloc", "clock"] }
nom = "7.1"
Contributor

Suggested change
nom = "7.1"
nom = "7.1"

3 changes: 3 additions & 0 deletions fuzz/fuzz_targets/from_str.rs
@@ -1,3 +1,6 @@
// For the full copyright and license information, please view the LICENSE
// file that was distributed with this source code.

#![no_main]

use libfuzzer_sys::fuzz_target;
3 changes: 3 additions & 0 deletions fuzz/fuzz_targets/parse_datetime_from_str.rs
@@ -1,3 +1,6 @@
// For the full copyright and license information, please view the LICENSE
// file that was distributed with this source code.

#![no_main]

use libfuzzer_sys::fuzz_target;
1 change: 1 addition & 0 deletions src/lib.rs
@@ -3,6 +3,7 @@

// Expose parse_datetime
pub mod parse_datetime;
pub mod parse_items;

use chrono::{Duration, Local, NaiveDate, Utc};
use regex::{Error as RegexError, Regex};
46 changes: 46 additions & 0 deletions src/parse_items.rs
@@ -0,0 +1,46 @@
// For the full copyright and license information, please view the LICENSE
// file that was distributed with this source code.

use nom::error::Error;
use nom::{IResult, Parser};

pub mod items;
pub(self) mod fixed_number;
pub(self) mod nano_seconds;

type PError<'i> = Error<&'i str>;
type PResult<'i, O> = IResult<&'i str, O, PError<'i>>;

fn singleton_list<'i, O>(
    mut inner: impl Parser<&'i str, O, PError<'i>>,
) -> impl Parser<&'i str, Vec<O>, PError<'i>> {
    move |input: &'i str| {
        let (tail, result) = inner.parse(input)?;
        Ok((tail, vec![result]))
    }
}

#[cfg(test)]
mod tests {
    macro_rules! ptest {
        ($name:ident : $parser:ident($input:literal) => $out:expr, $tail:literal) => {
            #[test]
            fn $name() {
                assert_eq!(
                    $parser.parse($input),
                    Ok((
                        $tail,
                        $out
                    ))
                );
            }
        };
        ($name:ident : $parser:ident($input:literal) => X) => {
            #[test]
            fn $name() {
                let result = $parser.parse($input);
                assert!(result.is_err(), "{:?}", result);
            }
        };
    }

    pub(super) use ptest;
}
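
Note on `singleton_list`: the combinator wraps a parser so that its single output becomes a one-element `Vec`. A minimal std-only sketch of the same shape follows. It is not the code in this PR: parsers are modelled here as closures returning `Option<(&str, O)>` instead of nom's `IResult`, and the names `digit` and `singleton_list_std` are hypothetical.

```rust
/// Hypothetical helper: consume one leading decimal digit.
fn digit(input: &str) -> Option<(&str, u32)> {
    let c = input.chars().next()?;
    let value = c.to_digit(10)?;
    // Skip exactly the bytes of the character that was consumed.
    Some((&input[c.len_utf8()..], value))
}

/// Wrap `inner` so that its single output is returned as a one-element Vec.
/// Assumption: `Option` stands in for nom's `IResult` in this sketch.
fn singleton_list_std<'i, O>(
    mut inner: impl FnMut(&'i str) -> Option<(&'i str, O)>,
) -> impl FnMut(&'i str) -> Option<(&'i str, Vec<O>)> {
    move |input: &'i str| {
        // Propagate failure of the inner parser unchanged.
        let (tail, value) = inner(input)?;
        Some((tail, vec![value]))
    }
}
```

Calling `singleton_list_std(digit)` on `"7x"` yields `Some(("x", vec![7]))`, and on `"x7"` it yields `None`, because the inner parser fails first.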
72 changes: 72 additions & 0 deletions src/parse_items/fixed_number.rs
@@ -0,0 +1,72 @@
// For the full copyright and license information, please view the LICENSE
// file that was distributed with this source code.

use nom::combinator::all_consuming;
use nom::{bytes::complete::take, character::complete, combinator::map_parser, Parser};

use crate::parse_items::PError;

macro_rules! fixed_number_impl {
    ($($t:ident),+) => {$(
        #[allow(dead_code)]
        pub fn $t<'i>(width: usize) -> impl Parser<&'i str, $t, PError<'i>> {
            move |input: &'i str| {
                map_parser(take(width), all_consuming(complete::$t)).parse(input)
            }
        }
    )+};
}

fixed_number_impl! { u8, u16, u32, u64, u128 }

#[cfg(test)]
mod tests {
    use crate::parse_items::{tests::ptest, PResult};

    use super::*;

    #[test]
    fn zero_width() {
        let result = u32(0).parse("1234");
        assert!(result.is_err(), "{:?}", result);
    }

    #[test]
    fn one_width() {
        assert_eq!(u32(1).parse("1234"), Ok(("234", 1)));
    }

    #[test]
    fn does_not_fit_type() {
        let result = u8(4).parse("1234");
        assert!(result.is_err(), "{:?}", result);
    }

    #[test]
    fn does_not_fit_negative() {
        let result = u8(3).parse("-123");
        assert!(result.is_err(), "{:?}", result);
    }

    #[test]
    fn input_too_short() {
        let result = u32(6).parse("1234");
        assert!(result.is_err(), "{:?}", result);
    }

    #[test]
    fn three() {
        assert_eq!(u32(3).parse("123abc"), Ok(("abc", 123)));
    }

    #[test]
    fn leading_zeroes() {
        assert_eq!(u32(3).parse("00123"), Ok(("23", 1)));
    }

    #[test]
    fn non_digits() {
        let result = u32(4).parse("123abc");
        assert!(result.is_err(), "{:?}", result);
    }
}
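
Note on the fixed-width parsers: the tests above pin down the intended behaviour (take exactly `width` characters, then require all of them to be digits that fit the target type). The sketch below restates that behaviour with the standard library only, as a reading aid. It is an illustration under stated assumptions, not the PR's implementation: `fixed_u32` is a hypothetical name, it returns `Option` instead of nom's `IResult`, and it slices by bytes where nom's `take` counts characters (the outcome is the same for the ASCII-digit inputs accepted here).

```rust
/// Hypothetical std-only stand-in for the nom-based `u32(width)` parser.
/// Returns the unparsed tail and the value, or `None` on any failure.
fn fixed_u32(input: &str, width: usize) -> Option<(&str, u32)> {
    // A zero-width window contains no digits, so it cannot be a number.
    if width == 0 {
        return None;
    }
    // `get` yields `None` when the input is too short or when `width`
    // does not fall on a character boundary.
    let head = input.get(..width)?;
    let tail = &input[width..];
    // Every byte of the window must be an ASCII digit: no sign, no spaces.
    if !head.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // `parse` ignores leading zeros and fails on overflow.
    let value = head.parse::<u32>().ok()?;
    Some((tail, value))
}
```

With this sketch, `fixed_u32("123abc", 3)` gives `Some(("abc", 123))` and `fixed_u32("00123", 3)` gives `Some(("23", 1))`, matching the `three` and `leading_zeroes` tests.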
27 changes: 27 additions & 0 deletions src/parse_items/gnu-items.md
@@ -0,0 +1,27 @@
<!--
For the full copyright and license information, please view the LICENSE
file that was distributed with this source code.
-->

## General date syntax
Contributor

It would be nice to have examples.

Author

@wanderinglethe Jun 26, 2023

I didn't know I had committed this file. I'm not sure we should add it, because it's taken straight from the GNU docs.

That said, I guess we could list the supported GNU features in the documentation,
and maybe open a tracking issue for all GNU features.

https://www.gnu.org/software/coreutils/manual/html_node/General-date-syntax.html

A date string can have different flavours (items):
- calendar date
- time of day
- time zone
- combined date and time of day
- day of the week
- relative
- numbers
- empty string (beginning of the day)

Some properties:
- the order of items does not matter
- whitespace may be omitted when unambiguous
- ordinal numbers may be written out in words in some items
- comments may appear between parentheses '(' and ')'
- alphabetic case is ignored
- hyphens not followed by a digit are ignored
- leading zeros on numbers are ignored
- leap seconds are accepted only on systems that support them
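
As a possible answer to the request for examples, a few of the properties listed above can be sketched as a pre-processing step. This is a std-only illustration and not part of the PR: `strip_comments` and `normalize` are hypothetical names, and handling nested parentheses with a depth counter is an assumption about the intended behaviour.

```rust
/// Drop everything between '(' and ')'.
/// Assumption: nested parentheses are tracked with a depth counter.
fn strip_comments(input: &str) -> String {
    let mut depth = 0usize;
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '(' => depth += 1,
            ')' if depth > 0 => depth -= 1,
            // Keep characters only while outside any comment.
            _ if depth == 0 => out.push(c),
            _ => {}
        }
    }
    out
}

/// Apply two of the listed properties: comments are removed and
/// alphabetic case is ignored (by lowercasing ASCII letters).
fn normalize(input: &str) -> String {
    strip_comments(input).to_ascii_lowercase()
}
```

For instance, `normalize("Mon (a comment) JAN")` returns `"mon  jan"`; the two spaces around the removed comment are kept, and a later stage would have to treat whitespace runs as a single separator. The "leading zeros are ignored" property falls out of ordinary integer parsing, since `"007".parse::<u32>()` is `Ok(7)`.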