Koozali.org: home of the SME Server
Obsolete Releases => SME Server 9.x => Topic started by: joost on October 31, 2019, 12:10:42 PM
-
I'm trying to automatically save email attachments to a folder using qmail and reformime. I use the following script:
#!/usr/bin/env bash
# This script processes the attachments of a MIME message read from stdin.
# It extracts all PDF attachments
# and returns the MIME message on stdout for further processing
# Ensure all locale settings are set to C, to prevent
# reformime from failing MIME headers decode with
# [unknown character set: ANSI_X3.4-1968]
# See: https://bugs.gentoo.org/304093
export LC_ALL=C LANG=C LANGUAGE=C
# Setting the destination path for saved attachments
attachements='/home/e-smith/files/users/user/attachements/'
trap 'rm -f -- "$mailmessage"' EXIT # Purge temporary mail message
# Create a temporary message file
mailmessage="$(mktemp)"
# Save stdin message to tempfile
cat > "$mailmessage"
# Iterate all MIME sections from the message
while read -r mime_section; do
# Get all section info headers
section_info="$(reformime -s "$mime_section" -i <"$mailmessage")"
# Parse the Content-Type header
content_type="$(grep 'content-type' <<<"$section_info" | cut -d ' ' -f 2-)"
# Parse the Content-Name header (if available)
content_name="$(grep 'content-name' <<<"$section_info" | cut -d ' ' -f 2-)"
# Decode the value of the Content-Name header
content_name="$(reformime -h "$content_name")"
if [[ $content_type = "application/pdf" || $content_name =~ .*\.[pP][dD][fF] ]]; then
# Attachment is a PDF
if [ -z "$content_name" ]; then
# The attachment has no name, so create a random name
content_name="$(mktemp --dry-run unnamed_XXXXXXXX.pdf)"
fi
# Prepend the date to the attachment filename
filename="$(date +%Y%m%d)_$content_name"
# Save the attachment to a file
reformime -s "$mime_section" -e <"$mailmessage" >"$attachements/$filename"
fi
done < <(reformime < "$mailmessage") # reformime lists all MIME sections
cat <"$mailmessage" # Re-inject the message to stdout for further processing
Problem: the script saves my file name.pdf as "attachements/20191025_[unknown character set: ANSI_X3.4-1968].pdf". So somehow reformime does not recognize the content of the Content-Name: header. Could this be due to a bad maildrop version? How can I solve this problem?
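Separate from the decoding problem, the script uses a name taken from the mail as a file name, so a name containing slashes or odd characters ends up in the path as-is. A minimal sketch of a clean-up step that could run before the date is prepended; safe_attachment_name and the unnamed.pdf fallback are hypothetical and not part of the original script:

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn an untrusted attachment name into a plain file name.
safe_attachment_name() {
  local name="$1"
  name="${name##*/}"                  # drop any directory part such as ../../
  name="${name//[^A-Za-z0-9._-]/_}"   # replace everything outside a conservative set
  name="${name#"${name%%[!.]*}"}"     # strip leading dots so no hidden file is created
  [ -n "$name" ] || name="unnamed.pdf"  # assumed fallback when nothing is left
  printf '%s\n' "$name"
}

safe_attachment_name '../../etc/passwd'
safe_attachment_name 'my report (1).pdf'
```

In the script this would sit between decoding content_name and building filename, for example content_name="$(safe_attachment_name "$content_name")".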
-
I have no specific idea, but a generic search for the error "unknown character set: ANSI_X3.4-1968" brings up some results you might want to read through.
However, the comment in your script suggests this problem is a known risk, so based on that and the search results I would check your locale settings first.
# Ensure all locale settings are set to C, to prevent
# reformime from failing MIME headers decode with
# [unknown character set: ANSI_X3.4-1968]
# See: https://bugs.gentoo.org/304093
export LC_ALL=C LANG=C LANGUAGE=C
Check this for starters.
locale -v
I would also imagine that without full access to the original mail we would struggle to replicate this for testing.
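A quick way to see where the name in the error message may come from is to ask the locale which character set it maps to. A minimal sketch, assuming a glibc system, where the C locale typically reports ANSI_X3.4-1968 (the formal name for US-ASCII); other C libraries may print a different name. If it prints that name, the export LC_ALL=C line in the script is the likely origin of the string in the error.

```bash
#!/usr/bin/env bash
# Compare the character set of the current locale with that of the C locale.
current_charmap="$(locale charmap)"
c_charmap="$(LC_ALL=C locale charmap)"

printf 'current locale charset: %s\n' "$current_charmap"
printf 'C locale charset:       %s\n' "$c_charmap"
```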
-
Thanks. Output of
locale -v
LANG=nl_NL.UTF-8
LC_CTYPE="nl_NL.UTF-8"
LC_NUMERIC="nl_NL.UTF-8"
LC_TIME="nl_NL.UTF-8"
LC_COLLATE="nl_NL.UTF-8"
LC_MONETARY="nl_NL.UTF-8"
LC_MESSAGES="nl_NL.UTF-8"
LC_PAPER="nl_NL.UTF-8"
LC_NAME="nl_NL.UTF-8"
LC_ADDRESS="nl_NL.UTF-8"
LC_TELEPHONE="nl_NL.UTF-8"
LC_MEASUREMENT="nl_NL.UTF-8"
LC_IDENTIFICATION="nl_NL.UTF-8"
LC_ALL=
I added export LC_ALL=C LANG=C LANGUAGE=C to my script because the script didn't work. Somebody suggested that I have a bad maildrop version; see the related mail-filter maildrop bug: bugs.gentoo.org/304093. My maildrop version seems to be 2.5.0. How can I upgrade this package, maybe to maildrop 2.9.3?
-
With SME v9.x you should be on 2.5.x
rpm -qa |grep maildrop
maildrop-2.5.0-13.el6.x86_64
-
Yes, indeed. The package I have is maildrop-2.5.0-13.el6.x86_64. Somebody suggested that the script works for them, but they use maildrop 2.9.3. Is there a way to upgrade this?
-
The link you posted to the bug said this issue was fixed in 2.4.1 so your version should be ok.
To get a newer version you'd have to build your own package.
And even then it may not actually solve your issue.
There are other comments out there, e.g.:
https://confluence.atlassian.com/confkb/filesystem-encoding-is-written-as-ansi_x3-4-1968-even-though-the-server-is-set-to-utf-8-658735809.html
You might need to try a bit more investigation.
You also might try and debug what is going on in your script.
-
If I try to debug the script by adding:
# DEBUG
exec 5> debug.txt
BASH_XTRACEFD="5"
PS4='$LINENO: '
set -x
to the script.
The output is:
17: export LC_ALL=C LANG=C LANGUAGE=C
17: LC_ALL=C
17: LANG=C
17: LANGUAGE=C
20: attachements=/home/e-smith/files/users/pdf/home/
22: trap 'rm -f -- "$mailmessage"' EXIT
225: mktemp
25: mailmessage=/tmp/tmp.yFIFzj3M1e
28: cat
31: read -r mime_section
558: reformime
334: reformime -s 1 -i
34: section_info='section: 1
content-type: multipart/mixed
content-transfer-encoding: 8bit
charset: ISO-8859-1
content-language: nl-NL
starting-pos: 0
starting-pos-body: 966
ending-pos: 11065
line-count: 170
body-line-count: 150'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=multipart/mixed
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=
443: reformime -h ''
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ multipart/mixed = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
45: [[ [unknown character set: ANSI_X3.4-1968] =~ .*\.[pP][dD][fF] ]]
31: read -r mime_section
334: reformime -s 1.1 -i
34: section_info='section: 1.1
content-type: text/plain
content-transfer-encoding: 7bit
charset: utf-8
starting-pos: 1050
starting-pos-body: 1138
ending-pos: 1146
line-count: 6
body-line-count: 3'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=text/plain
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=
443: reformime -h ''
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ text/plain = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
45: [[ [unknown character set: ANSI_X3.4-1968] =~ .*\.[pP][dD][fF] ]]
31: read -r mime_section
334: reformime -s 1.2 -i
34: section_info='section: 1.2
content-type: application/pdf
content-name: test.pdf
content-transfer-encoding: base64
charset: ISO-8859-1
content-disposition: attachment
content-disposition-filename: test.pdf
starting-pos: 1186
starting-pos-body: 1323
ending-pos: 11023
line-count: 138
body-line-count: 132'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=application/pdf
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=test.pdf
443: reformime -h test.pdf
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ application/pdf = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
47: '[' -z '[unknown character set: ANSI_X3.4-1968]' ']'
552: date +%Y%m%d
52: filename='20191130_[unknown character set: ANSI_X3.4-1968]'
55: reformime -s 1.2 -e
31: read -r mime_section
60: cat
1: rm -f -- /tmp/tmp.yFIFzj3M1e
It seems like something goes wrong here:
# Decode the value of the Content-Name header
content_name="$(reformime -h "$content_name")"
I don't understand this. Can you help me?
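A note on reading the trace above: bash repeats the first character of the expanded PS4 once per level of nesting, so with PS4='$LINENO: ' a command on line 25 that runs inside $(...) is printed as 225:, and 334: is line 34. A sketch of the same setup with a fixed first character, so that nesting shows up as extra + signs and the line numbers stay readable. The temporary file and variable names are illustrative, and BASH_XTRACEFD needs bash 4.1 or newer:

```bash
#!/usr/bin/env bash
trace_file="$(mktemp)"        # illustrative; the thread writes to debug.txt
exec 5>"$trace_file"          # open file descriptor 5 for the trace
BASH_XTRACEFD=5               # send xtrace output to fd 5 instead of stderr
PS4='+ line $LINENO: '        # '+' is the character bash repeats per nesting level
set -x
outer="$(echo nested)"        # traced inside $(...) and again as the assignment
set +x
unset BASH_XTRACEFD           # tracing goes back to stderr and fd 5 is closed

trace_output="$(cat "$trace_file")"
rm -f -- "$trace_file"
printf '%s\n' "$trace_output"
```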
-
Solved it by adding the -c option to reformime. This results in:
content_name="$(reformime -c UTF-8 -h "$content_name")"
source: sourceforge.net/p/courier/mailman/message/24972857
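The fix can be folded into a small helper. This is a minimal sketch, not the original poster's code: decode_header_value is a hypothetical name, and the pass-through branch for a missing reformime is an assumption added only so the snippet runs on machines without maildrop. It also skips empty values, because the trace shows reformime -h '' returning the error text for sections without a name, which keeps the later -z check from ever matching.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): decode a MIME
# header value while naming the character set explicitly.
decode_header_value() {
  local raw="$1"
  # Nothing to decode: print nothing so a later [ -z ... ] test still works
  if [ -z "$raw" ]; then
    return 0
  fi
  if command -v reformime >/dev/null 2>&1; then
    # -c UTF-8 is the option from the fix in this thread; it appears to set
    # the character set explicitly instead of taking it from the locale
    reformime -c UTF-8 -h "$raw"
  else
    # Assumption for illustration only: pass the value through unchanged
    printf '%s\n' "$raw"
  fi
}

content_name="$(decode_header_value "test.pdf")"
printf 'decoded name: %s\n' "$content_name"
```

In the script, the line content_name="$(reformime -h "$content_name")" would become content_name="$(decode_header_value "$content_name")".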
-
Winner :-)
-
Nice find. Well done!!