Commit Graph

18 Commits

Author SHA1 Message Date
Frank Denis a71b531d2e Re-add -o / --output-file 2020-04-21 23:40:58 +02:00
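For illustration, an optional -o / --output-file argument could be wired up roughly as follows. This is a minimal sketch and not the script's actual code. It assumes argparse, uses default=None so that output falls back to stdout when no file is named (which keeps shell redirection working), and writes with an explicit UTF-8 encoding when a file is given. The helper names build_parser and write_output are hypothetical.

```python
import argparse
import sys


def build_parser():
    # Hypothetical parser; only the -o / --output-file option is shown.
    parser = argparse.ArgumentParser(description="Generate a domains blacklist")
    parser.add_argument(
        "-o",
        "--output-file",
        default=None,
        help="write to this file (UTF-8) instead of stdout",
    )
    return parser


def write_output(text, output_file=None):
    # With no file name, behave as before and let the shell redirect stdout.
    if output_file is None:
        sys.stdout.write(text)
        return
    # With a file name, force UTF-8 regardless of the locale (Python 3).
    with open(output_file, "w", encoding="utf-8") as f:
        f.write(text)


if __name__ == "__main__":
    args = build_parser().parse_args()
    write_output("example.com\n", args.output_file)
```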
Frank Denis dcd6f8448d Revert "Improve generate-domains-blacklist.py to remove redundant lines (#1184)"
This reverts commit 58871de725.
2020-04-21 23:08:40 +02:00
Huhni 58871de725
Improve generate-domains-blacklist.py to remove redundant lines (#1184)
* Improve script to remove redundant lines

Let the script remove those lines that are covered by regular expressions already

* add optional "-o OUTPUT_FILE" argument 

This ensures that UTF-8 is used.
The redirect to file functionality from before is maintained, because "default=None" is used for the -o argument

I also fixed the formatting slightly to avoid newlines at the beginning of the file.

* improve glob matching

- rename regexes into globs 
- only check trusted (local) files for globs
- use fnmatch instead of manually converting globs into regular expressions and matching them
- modify is_glob function to check only for the following characters: * [ ] ?
- improve get_lines_with_globs function, by using the native filter and lambda functions
- improve covered_by_glob function, by checking if line is part of glob_list, instead of calling is_glob again
- print "ignored entries due to globs in local-additions" to the output as well to better differentiate from other duplicates
2020-04-21 23:07:32 +02:00
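The glob handling described in this (later reverted) commit can be sketched as below. This is an illustration under stated assumptions, not the committed code: the function names is_glob and covered_by_glob come from the commit message, but their bodies and the remove_redundant helper are guesses. The sketch uses fnmatch.fnmatchcase rather than fnmatch.fnmatch so that matching is case-sensitive on every platform.

```python
from fnmatch import fnmatchcase

# Characters that mark an entry as a glob, per the commit message: * [ ] ?
GLOB_CHARS = set("*[]?")


def is_glob(line):
    return any(c in GLOB_CHARS for c in line)


def covered_by_glob(glob_list, line):
    # A glob is never "covered" by itself; check membership instead of
    # calling is_glob again.
    if line in glob_list:
        return False
    return any(fnmatchcase(line, pattern) for pattern in glob_list)


def remove_redundant(names, trusted_names):
    # Hypothetical helper: only trusted (local) entries may contribute globs.
    glob_list = [name for name in trusted_names if is_glob(name)]
    return [name for name in names if not covered_by_glob(glob_list, name)]


if __name__ == "__main__":
    trusted = ["*.example.com"]
    names = ["ads.example.com", "example.org", "*.example.com"]
    for name in remove_redundant(names, trusted):
        print(name)
```

Entries matched by a trusted glob are dropped, while the glob itself and unrelated names are kept.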
Cristian-J 05593a8bbd Ignore links that start with a hyphen or a dot
If you use filter blacklists you'll end up with many invalid links that start with a hyphen or a dot in the final blacklist.
2020-01-08 12:57:22 -07:00
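A check of this kind might look like the following minimal sketch. The function name is_valid_entry is hypothetical, and the actual change may be implemented differently (for example inside a regular expression).

```python
def is_valid_entry(name):
    # Reject empty names and names that begin with a hyphen or a dot,
    # which filter-style lists can produce after parsing.
    return bool(name) and not name.startswith(("-", "."))


if __name__ == "__main__":
    candidates = ["example.com", "-ads.example.com", ".tracker.example", ""]
    print([name for name in candidates if is_valid_entry(name)])
```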
Frank Denis 69f00ca977 Don't use the message attribute to get an error message
Fixes #1123
2019-12-23 18:58:39 +01:00
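As background, Python 3 exceptions have no message attribute, so reading err.message raises AttributeError there. A portable way to obtain the text is str(err), sketched below. The helper name error_text is hypothetical, and this is not necessarily how the commit fixes the issue.

```python
import sys


def error_text(err):
    # str(err) works on both Python 2 and Python 3;
    # err.message does not exist on Python 3 exceptions.
    return str(err)


if __name__ == "__main__":
    try:
        raise ValueError("connection refused")
    except ValueError as err:
        sys.stderr.write("[example] could not be loaded: %s\n" % error_text(err))
```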
Frank Denis a308c76191 Format 2019-12-23 18:55:37 +01:00
Frank Denis 77f2eef886 Change the user agent 2019-08-27 18:26:29 +02:00
Frank Denis 5f29677400 Format 2019-08-27 18:25:47 +02:00
Simon R f3e032f88a fix remaining urllib2 reference (#830) 2019-05-22 20:50:45 +02:00
Simon R bc5e4f0544 make generate-domains-blacklist.py compatible with both python2 and python3 (#828)
* update domains-blacklist-all.conf: Quidsup NoTrack moved to gitlab

* make generate-domains-blacklist.py python3 compatible

* fix whitespace
2019-05-22 10:15:08 +02:00
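One common way to support both interpreters is a guarded import, sketched below. This is an assumption about the general approach, not a copy of the commit: it tries the Python 2 urllib2 module first and falls back to the Python 3 locations of the same names.

```python
try:
    # Python 2
    from urllib2 import urlopen, Request, URLError
except ImportError:
    # Python 3: the same names live in urllib.request and urllib.error
    from urllib.request import urlopen, Request
    from urllib.error import URLError


if __name__ == "__main__":
    # Building a Request does not touch the network.
    req = Request("https://example.com/list.txt")
    print(req.get_full_url())
```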
Frank Denis 5ee3512460 generate-domains-blacklist.py: properly handle time restrictions
Fixes #710
2019-02-15 00:03:02 +01:00
Frank Denis c142923b46 Add a dedicated function for trusted lists 2019-02-14 23:27:19 +01:00
Frank Denis 9224e79c59 Add NoTracking's list to the example blacklist configuration
Also implement dnsmasq-style filters
2018-03-26 20:43:42 +02:00
Frank Denis 25f1de385b Minor changes for clarity 2018-02-19 16:38:06 +01:00
Alexandre L 9b701d8121
Support time-based blacklist from domains-time-restricted.txt
* Modified list_from_url to load_from_url to avoid reading the `time_restricted` file twice (once for the output, once for the whitelist)
2018-02-19 14:38:43 +01:00
Frank Denis ac395b03fc Bump the default timeout up 2018-02-11 20:51:48 +01:00
Benjamin Dos Santos 53e9c79194
feat: add a flag to setup the open URL timeout
Sometimes I randomly encounter a timeout when I generate the blacklist. This commit adds the
ability to increase the timeout delay (defaults to 10s).
2018-02-11 19:24:21 +01:00
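A timeout flag of this kind could be sketched as follows. The option names -t / --timeout and the helpers build_parser and fetch are assumptions made for illustration. Only the 10 second default comes from the commit message, and a later commit raises it.

```python
import argparse
from urllib.request import urlopen


def build_parser():
    # Hypothetical parser; only the timeout option is shown.
    parser = argparse.ArgumentParser(description="Generate a domains blacklist")
    parser.add_argument(
        "-t",
        "--timeout",
        type=int,
        default=10,
        help="URL open timeout in seconds",
    )
    return parser


def fetch(url, timeout):
    # urlopen accepts a timeout in seconds for the blocking connection attempt.
    with urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("timeout: %d seconds" % args.timeout)
```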
Frank Denis 35e32b823f Import the generate-domains-blacklists tool 2018-01-17 15:28:07 +01:00