Commit bc4e0cd

wxsBSD and malvidin authored
Handle invalid unicode in metadata values. (VirusTotal#136)
* Handle invalid unicode in metadata values.

  In VirusTotal#135 it was brought up that you can crash the python interpreter if you have invalid unicode in a metadata value. This is my attempt to fix that by attempting to create a string, and if that fails falling back to a bytes object. On the weird chance that the bytes object fails to create I added a safety check so that we don't add a NULL ptr to the dictionary (this is how the crash was manifesting).

  It's debatable if we want to ONLY add strings as metadata, and NOT fallback to bytes. If we don't fall back to bytes the only other option I see is to silently drop that metadata on the floor. The tradeoff here is that now you may end up with a string or a bytes object in your metadata dictionary, which is less than ideal IMO. I'm open to suggestions on this one.

  Fixes VirusTotal#135

* Add error handling to conversion to Unicode

  Metadata test accepts stripped or original characters

* Remove 'or' clause from tests and add another NULL test check.

Co-authored-by: malvidin <malvidin@gmail.com>
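For readers skimming the message, the first approach it describes (try to build a string, fall back to bytes, never store a NULL) can be sketched in Python roughly as follows. This is an illustration only: `meta_to_python` and `add_meta` are hypothetical names, not functions in yara-python, and the final diff in this commit takes a different route by decoding with the "ignore" error handler.

```python
from typing import Optional, Union


def meta_to_python(raw: Optional[bytes]) -> Union[str, bytes, None]:
    """Hypothetical helper: prefer str, fall back to the raw bytes."""
    if raw is None:
        # Stands in for a failed object creation (NULL in the C extension).
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid UTF-8: keep the value, but as a bytes object.
        return raw


def add_meta(meta: dict, key: str, raw: Optional[bytes]) -> None:
    """Hypothetical helper: only store values that were actually created."""
    value = meta_to_python(raw)
    if value is not None:
        meta[key] = value


meta = {}
add_meta(meta, "valid", b"foo")
add_meta(meta, "invalid", b"foo\x80bar")
add_meta(meta, "missing", None)
print(meta)
```

The tradeoff named in the message shows up directly: `meta["valid"]` is a `str` while `meta["invalid"]` is `bytes`, so callers would have to handle both types.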
1 parent ab3431b commit bc4e0cd

2 files changed: +20 -1 lines changed


tests.py

Lines changed: 19 additions & 0 deletions
@@ -692,6 +692,25 @@ def testEntrypoint(self):
             'rule test { condition: entrypoint >= 0 }',
         ])

+    def testMeta(self):
+
+        r = yara.compile(source=r'rule test { meta: a = "foo\x80bar" condition: true }')
+        self.assertTrue(list(r)[0].meta['a'] == 'foobar')
+
+    # This test ensures that anything after the NULL character is stripped.
+    def testMetaNull(self):
+
+        r = yara.compile(source=r'rule test { meta: a = "foo\x00bar\x80" condition: true }')
+        self.assertTrue(list(r)[0].meta['a'] == 'foo')
+
+    # This test is similar to testMeta but it tests the meta data generated
+    # when a Match object is created.
+    def testScanMeta(self):
+
+        r = yara.compile(source=r'rule test { meta: a = "foo\x80bar" condition: true }')
+        m = r.match(data='dummy')
+        self.assertTrue(list(m)[0].meta['a'] == 'foobar')
+
     def testFilesize(self):

         self.assertTrueRules([

yara-python.c

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ typedef long Py_hash_t;
 #endif

 #if PY_MAJOR_VERSION >= 3
-#define PY_STRING(x) PyUnicode_FromString(x)
+#define PY_STRING(x) PyUnicode_DecodeUTF8(x, strlen(x), "ignore" )
 #define PY_STRING_TO_C(x) PyUnicode_AsUTF8(x)
 #define PY_STRING_CHECK(x) PyUnicode_Check(x)
 #else
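
The effect of the new macro on the test inputs can be approximated in pure Python. This is a sketch under two assumptions: the metadata value reaches the macro as a NUL-terminated C string (so `strlen` stops at the first `\x00`), and the "ignore" handler behaves like Python's `bytes.decode(..., errors="ignore")`. The name `py_string` is hypothetical and only mirrors the macro for illustration.

```python
def py_string(raw: bytes) -> str:
    """Approximate PY_STRING(x) after this commit for a C buffer `raw`."""
    # strlen(x) counts bytes up to the first NUL, so the rest is never decoded.
    c_string = raw.split(b"\x00", 1)[0]
    # The "ignore" error handler silently drops bytes that are not valid UTF-8.
    return c_string.decode("utf-8", errors="ignore")


# Same inputs as testMeta / testScanMeta and testMetaNull.
assert py_string(b"foo\x80bar") == "foobar"
assert py_string(b"foo\x00bar\x80") == "foo"
```

Because invalid bytes are dropped rather than preserved, metadata values always come back as `str`, at the cost of losing the original byte sequence.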
