
SpamAssassin rules work only during first scan with "spamc"

I've made a bunch of custom rules in my SpamAssassin setup. When I start the service (systemctl start spamassassin), those rules are properly evaluated on my test spam e-mail:
[antek@mailgate ~]$ cat /tmp/spam.txt | spamc -R -l
23.4/5.0
Spam detection software, running on the system "mailgate.anadoxin.org",
has identified this incoming email as possible spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  jeśli wiadomość nie wyświetliła się poprawnie, kliknij
tutaj by przejść do oferty. astra jesienne ceny opla od 89 900 zł lub
760 zł netto/mies. f gg 9086 sprawdź opel niniejszy materiał ni [...]

Content analysis details:   (23.4 points, 5.0 required)

pts rule name              description
---- ---------------------- --------------------------------------------------
-5.0 RCVD_IN_DNSWL_HI       RBL: Sender listed at https://www.dnswl.org/ , high
                            trust
                            [91.185.184.51 listed in list.dnswl.org]
 0.0 URIBL_BLOCKED          ADMINISTRATOR NOTICE: The query to URIBL was blocked.
                            See
                            http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block 
                            for more information.
                            [URI: xya.pl]
                            [URI: doubleclick.net]
                            [URI: dobrebazy.pl]
                            [URI: brightsender.pl]
                            [URI: ddtracker.pl]
                            [URI: lrmailr.pl]
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from author's
                            domain
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily valid
-0.5 BAYES_05               BODY: Bayes spam probability is 1 to 5%
                            [score: 0.0121]
 2.5 GENERIC_MAILING        mailing@ in From: email address
 20 NIP_SPAM_1             BODY: No description available.
 0.5 NUMER_NIP              BODY: No description available.
 1.0 KLIKNIJ_TUTAJ          BODY: No description available.
 1.0 OFERT                  BODY: Ofert
 0.2 BAD_WORDS_2            BODY: No description available.
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 HTML_IMAGE_RATIO_02    BODY: HTML has a low ratio of text to image area
 0.3 CEN_IN_BODY            RAW: cen
-0.3 CENTER_IN_BODY         RAW: No description available.
 0.5 UNSUBSCRIBE            RAW: Unsubscribe in body
 0.5 NO_TO_NAME             No Real Name in To: header
 0.0 T_KAM_HTML_FONT_INVALID Test for Invalidly Named or Formatted Colors
                            in HTML
 0.9 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/) 
 1.9 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
                            [cf: 100]
... but when I use "spamc" to evaluate the e-mail again right after the first try, my custom rules are not there anymore, and the e-mail is not evaluated as spam:
-2.8/5.0
Spam detection software, running on the system "mailgate.anadoxin.org",
has NOT identified this incoming email as spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  jeśli wiadomość nie wyświetliła się poprawnie, kliknij
tutaj by przejść do oferty. astra jesienne ceny opla od 89 900 zł lub
760 zł netto/mies. f gg 9086 sprawdź opel niniejszy materiał ni [...]

Content analysis details:   (-2.8 points, 5.0 required)

pts rule name              description
---- ---------------------- --------------------------------------------------
-5.0 RCVD_IN_DNSWL_HI       RBL: Sender listed at https://www.dnswl.org/ , high
                            trust
                            [91.185.184.51 listed in list.dnswl.org]
 0.0 URIBL_BLOCKED          ADMINISTRATOR NOTICE: The query to URIBL was blocked.
                            See
                            http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block 
                            for more information.
                            [URI: xya.pl]
                            [URI: doubleclick.net]
                            [URI: ddtracker.pl]
                            [URI: brightsender.pl]
                            [URI: dobrebazy.pl]
                            [URI: lrmailr.pl]
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from author's
                            domain
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily valid
-0.5 BAYES_05               BODY: Bayes spam probability is 1 to 5%
                            [score: 0.0121]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 HTML_IMAGE_RATIO_02    BODY: HTML has a low ratio of text to image area
 0.0 T_KAM_HTML_FONT_INVALID Test for Invalidly Named or Formatted Colors
                            in HTML
 0.9 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
 1.9 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
                            [cf: 100]
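To pin down exactly which rules drop out between two runs, the rule-name column of each report can be extracted and compared. This is only a minimal sketch, assuming the report layout shown above (a numeric score in the first column, an upper-case rule name in the second). The names `rule_names`, `first.rules` and `second.rules` are hypothetical and used for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: print the rule names found in a "spamc -R" report
# read on stdin. Assumes each rule line starts with a numeric score
# followed by an upper-case rule name, as in the reports above.
rule_names() {
    awk '$1 ~ /^-?[0-9]+(\.[0-9]+)?$/ && $2 ~ /^[A-Z][A-Z0-9_]+$/ { print $2 }' |
        sort -u
}

# Example usage (file names are assumptions):
#   spamc -R < /tmp/spam.txt | rule_names > first.rules
#   spamc -R < /tmp/spam.txt | rule_names > second.rules
#   comm -23 first.rules second.rules   # rules seen only in the first run
```

Wrapped description lines and the "Content preview" text are ignored because their second field is not an upper-case rule name.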
The second, non-spam result repeats for every subsequent "spamc" run. When I restart the spamassassin service, the e-mail is once again classified as spam because my custom rules are evaluated, but only for the first scan. Later invocations omit my custom rules, and the e-mail is no longer flagged as spam. When I run spamd in debug mode, it finds my custom config files but does not load them, because it considers them "already read". Is this a bug in SpamAssassin?
Oct 18 07:51:51.935  dbg: prefork: ordered 1105768 to accept
Oct 18 07:51:51.938  dbg: spamd: select() on fd bit field 00000110, timeout 0.5, not locked
Oct 18 07:51:51.939  dbg: prefork: sysread(7) not ready, wait max 300.0 secs
Oct 18 07:51:51.941  dbg: spamd: accept() on fd 5
Oct 18 07:51:51.943  dbg: prefork: child 1105768: entering state 2
Oct 18 07:51:51.944  dbg: prefork: new lowest idle kid: 1105769
Oct 18 07:51:51.951  dbg: netset:  cached lookup on ::1, 2 networks, result: 1
Oct 18 07:51:51.952  info: spamd: connection from ::1 [::1]:44452 to port 783, fd 5
Oct 18 07:51:51.954  dbg: util: get_user_groups: uid is 1000
Oct 18 07:51:51.956  dbg: util: get_user_groups: added 10 (wheel) to group list which is now: 1000 10
Oct 18 07:51:51.959  info: spamd: setuid to antek succeeded
Oct 18 07:51:51.961  dbg: config: parsing file /home/antek/.spamassassin/user_prefs
Oct 18 07:51:51.963  dbg: config: fixed relative path: /home/antek/.spamassassin/custom.cf
Oct 18 07:51:51.964  dbg: config: using "/home/antek/.spamassassin/custom.cf" for included file
Oct 18 07:51:51.966  dbg: config: skipping already read file: /home/antek/.spamassassin/custom.cf
Oct 18 07:51:51.967  dbg: config: parsing file /home/antek/.spamassassin/user_prefs
Oct 18 07:51:51.968  dbg: config: fixed relative path: /home/antek/.spamassassin/playfire.cf
Oct 18 07:51:51.969  dbg: config: using "/home/antek/.spamassassin/playfire.cf" for included file
Oct 18 07:51:51.970  dbg: config: skipping already read file: /home/antek/.spamassassin/playfire.cf
Oct 18 07:51:51.971  dbg: config: parsing file /home/antek/.spamassassin/user_prefs
Oct 18 07:51:51.972  dbg: config: fixed relative path: /home/antek/.spamassassin/listonic.cf
Oct 18 07:51:51.973  dbg: config: using "/home/antek/.spamassassin/listonic.cf" for included file
Oct 18 07:51:51.974  dbg: config: skipping already read file: /home/antek/.spamassassin/listonic.cf
[... snip ...]
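The debug output can be reduced to the list of config files that spamd declined to re-read. This is a minimal sketch, assuming the log line format shown above. `skipped_files` and `spamd-debug.log` are hypothetical names, not part of SpamAssassin.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct files reported as
# "skipping already read file" in a spamd debug log read on stdin.
skipped_files() {
    sed -n 's/.*config: skipping already read file: //p' | sort -u
}

# Example usage (the log file name is an assumption):
#   skipped_files < spamd-debug.log
```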
What could be the problem here? For now, it works if I comment out the "already read" check in the source code:
sub read_cf_file {
  my($self, $path) = @_;
  my $txt = '';

  #if ($self->{cf_files_read}->{$path}++) {
  #dbg("config: skipping already read file: $path");
  #return $txt;
  #}
but I have the impression that there should be a better way to solve this :)
Asked by antekone (722 rep)
Oct 18, 2023, 06:18 AM