save docs

This commit is contained in:
Martin Rotter 2021-03-05 19:58:50 +01:00 committed by Martin Rotter
parent 654997227b
commit 57cb76b412
4 changed files with 65 additions and 66 deletions

View File

@ -8,13 +8,14 @@
* [Localizations](#localizations)
* [Videos](#videos)
* [Web-based and lite app variants](#web-based-and-lite-app-variants)
* [RSS Guard 3 vs. RSS Guard 4](#RSS-Guard-3-vs.-RSS-Guard-4)
* [RSS Guard 3 vs. RSS Guard 4](#rss-guard-3-vs.-rss-guard-4)
* [Features](#features)
* [List of main features](#list-of-main-features)
* [Supported feed formats and online feed services](Feed-formats.md)
* [Message filtering](Message-filters.md)
* [Database backends](#database-backends)
* [Google Reader API](#google-reader-api)
* [Websites scraping](#websites-scraping)
* [Gmail](#gmail)
* [Feedly](#feedly)
* [Labels](Labels.md)
@ -110,10 +111,7 @@ RSS Guard is simple (yet powerful) feed reader. It is able to fetch the most kno
* Nextcloud News (RSS Guard 3.1.0+),
* Inoreader (RSS Guard 3.5.0+),
* Gmail with e-mail sending (RSS Guard 3.7.1+).
* FreshRSS (RSS Guard 3.9.0+),
* The Old Reader (RSS Guard 3.9.0+),
* Bazqux (RSS Guard 3.9.0+),
* Reedah (RSS Guard 3.9.0+),
* Google Reader API (FreshRSS, The Old Reader, Bazqux, Reedah and others) (RSS Guard 3.9.0+),
* Feedly (RSS Guard 3.9.0+).
* core:
* support for all feed formats (RSS/RDF/ATOM/JSON),
@ -122,7 +120,7 @@ RSS Guard is simple (yet powerful) feed reader. It is able to fetch the most kno
* universal plugin for online services with [Google Reader API](#google-reader-api),
* possibility of using custom 3rd-party feed synchronization services,
* feed metadata fetching including icons,
* support for [scraping websites](Feed-formats.md#websites-scraping-and-related-advanced-features) which do not offer RSS/ATOM feeds and other related advanced features,
* support for [scraping websites](#websites-scraping) which do not offer RSS/ATOM feeds and other related advanced features,
* simple internal Chromium-based web viewer (or alternative version with simpler and much more lightweight internal viewer),
* scriptable [message filtering](#message-filtering),
* downloader with own tab and support for up to 6 parallel downloads,
@ -137,8 +135,8 @@ RSS Guard is simple (yet powerful) feed reader. It is able to fetch the most kno
* fully-featured recycle bin,
* multiple data backend support,
* SQLite,
* MySQL.
* ability to specify target database by its name (MySQL backend),
* MariaDB.
* ability to specify target database by its name (MariaDB backend),
* support for `feed://` URI scheme.
* user interface:
* message list filter with regular expressions,
@ -184,8 +182,58 @@ Note that even when all Google Reader API enabled services should follow the API
For example The Old Reader does not seem to offer tags/labels functionality, therefore tags/labels in RSS Guard are not synchronized, but you can still use offline labels.
## Websites scraping
> **Only proceed if you consider yourself to be a power user and you know what you are doing!**
RSS Guard 3.9.0+ offers extra advanced features which are inspired by [Liferea](https://lzone.de/liferea/).
You can select source type of each feed. If you select `URL`, then RSS Guard simply downloads feed file from given location and behave like everyone would expect.
However, if you choose `Script` option, then you cannot provide URL of your feed and you rely on custom script to generate feed file and provide its contents to **standard output**. Resulting data written to standard output should be valid feed file, for example RSS or ATOM XML file.
`Fetch it now` button also works with `Script` option. Therefore, if your source script and (optional) post-process script in cooperation deliver a valid feed file to the output, then all important metadata, like title or icon of the feed, can be automagically discovered.
<img src="images/scrape-source-type.png" width="50%">
Any errors in your script must be written to **error output**.
Note that you must provide full execution line to your custom script, including interpreter binary path and name and all that must be written in special format `<interpreter>#<argument1>#<argument2>#....`. The `#` character is there to separate interpreter and individual arguments. I had to select some character as separator because simply using space ` ` is not that easy as it might sound, because sometimes space could be a part of an argument sometimes argument separator etc.
Interpreter must be provided in all cases, arguments do not have to be. For example `bash.exe#` is valid execution line, as well as `bash#-c#cat feed.atom`. Note the difference in interpreter's binary name suffix. Also be very carefully about arguments quoting. Some examples of valid and tested execution lines are:
| Command | Explanation |
|---------|-------------|
| `bash#-c#curl https://github.com/martinrotter.atom` | Downloads ATOM feed file with Bash and Curl. |
| `Powershell#Invoke-WebRequest 'https://github.com/martinrotter.atom' \| Select-Object -ExpandProperty Content` | Downloads ATOM feed file with Powershell. |
| `php#tweeper.php#-v#0#https://twitter.com/NSACareers` | Scrape Twitter RSS feed file with [Tweeper](https://git.ao2.it/tweeper.git). Tweeper is utility which is able to produce RSS feed from Twitter and other similar social platforms. |
<img src="images/scrape-source.png" width="50%">
Note that the above examples are cross-platform and you can use the exact same command on Windows, Linux or Mac OS X, if your operating system is properly configured.
RSS Guard offers placeholder `%data%` which is automatically replaced with full path to RSS Guard's [user data folder](Documentation.md#portable-user-data). You can, therefore, use something like this as source script line: `bash#%data%/scripts/download-feed.sh`.
Also, working directory of process executing the script is set to RSS Guard's user data folder.
There are some examples of website scrapers [here](https://github.com/martinrotter/rssguard/tree/master/resources/scripts/scrapers), most of the are written in Python 3, thus their execution line is `python#script.py`.
After your source feed data are downloaded either via URL or custom script, you can optionally post-process the data with one more custom script, which will take **raw source data as input** and must produce processed valid feed data to **standard output** while printing all error messages to **error output**.
Format of post-process script execution line is the same as above.
<img src="images/scrape-post.png" width="50%">
Typical post-processing filter might do things like advanced CSS formatting or filtering of feed file entries or removing ads:
| Command | Explanation |
|---------|-------------|
| `bash#-c#xmllint --format -` | Pretty-print input XML feed data. |
It's completely up to you if you decide to only use script as `Source` of the script or separate your custom functionality between `Source` script and `Post-process` script. Sometimes you might need different `Source` scripts for different online sources and the same `Post-process` script and vice versa.
## Gmail
RSS Guard includes Gmail plugin, which allows users to receive and send (!!!) e-mail messages. Plugin uses [Gmail API](https://developers.google.com/gmail/api) and offers some e-mail client-like features:
RSS Guard includes Gmail plugin, which allows users to receive and send e-mail messages in a very simple fashion. Plugin uses [Gmail API](https://developers.google.com/gmail/api) and offers some e-mail client-like features:
* Sending e-mail messages.
<img src="images/gmail-new-email.png">

View File

@ -43,52 +43,4 @@ OPML files can be exported/imported in simple dialog.
<img src="images/im-ex-feeds.png" width="80%">
<img src="images/im-ex-feeds-dialog.png" width="50%">
You just select output file (in case of OPML export), check desired feeds and hit `Export to file`.
### Websites scraping and related advanced features
> **Only proceed if you consider yourself to be a power user and you know what you are doing!**
RSS Guard 3.9.0+ offers extra advanced features which are inspired by [Liferea](https://lzone.de/liferea/).
You can select source type of each feed. If you select `URL`, then RSS Guard simply downloads feed file from given location and behave like everyone would expect.
However, if you choose `Script` option, then you cannot provide URL of your feed and you rely on custom script to generate feed file and provide its contents to **standard output**. Resulting data written to standard output should be valid feed file, for example RSS or ATOM XML file.
`Fetch it now` button also works with `Script` option. Therefore, if your source script and (optional) post-process script in cooperation deliver a valid feed file to the output, then all important metadata, like title or icon of the feed, can be automagically discovered.
<img src="images/scrape-source-type.png" width="50%">
Any errors in your script must be written to **error output**.
Note that you must provide full execution line to your custom script, including interpreter binary path and name and all that must be written in special format `<interpreter>#<argument1>#<argument2>#....`. The `#` character is there to separate interpreter and individual arguments.
Interpreter must be provided in all cases, arguments do not have to be. For example `bash.exe#` is valid execution line, as well as `bash#-c#cat feed.atom`. Note the difference in interpreter's binary name suffix. Also be very carefuly about arguments quoting. Some examples of valid and tested execution lines are:
| Command | Explanation |
|---------|-------------|
| `bash#-c#curl https://github.com/martinrotter.atom` | Downloads ATOM feed file with Bash and Curl. |
| `Powershell#Invoke-WebRequest 'https://github.com/martinrotter.atom' \| Select-Object -ExpandProperty Content` | Downloads ATOM feed file with Powershell. |
| `php#tweeper.php#-v#0#https://twitter.com/NSACareers` | Scrape Twitter RSS feed file with [Tweeper](https://git.ao2.it/tweeper.git). Tweeper is utility which is able to produce RSS feed from Twitter and other similar social platforms. |
<img src="images/scrape-source.png" width="50%">
Note that the above examples are cross-platform and you can use the exact same command on Windows, Linux or Mac OS X, if your operating system is properly configured.
RSS Guard offers placeholder `%data%` which is automatically replaced with full path to RSS Guard's [user data folder](Documentation.md#portable-user-data). You can, therefore, use something like this as source script line: `bash#%data%/scripts/download-feed.sh`.
Also, working directory of process executing the script is set to RSS Guard's user data folder.
There are some examples of website scrapers [here](https://github.com/martinrotter/rssguard/tree/master/resources/scripts/scrapers), most of the are written in Python 3, thus their execution line is `python.exe#script.py`.
After your source feed data are downloaded either via URL or custom script, you can optionally post-process the data with one more custom script, which will take **raw source data as input** and must produce processed valid feed data to **standard output** while printing all error messages to **error output**.
Format of post-process script execution line is the same as above.
<img src="images/scrape-post.png" width="50%">
Typical post-processing filter might do things like advanced CSS formatting of feed file entries, removing some ads or simply pretty-printing XML data:
| Command | Explanation |
|---------|-------------|
| `bash#-c#xmllint --format -` | Pretty-print input XML feed data. |
You just select output file (in case of OPML export), check desired feeds and hit `Export to file`.

View File

@ -6,7 +6,6 @@ RSS Guard supports _automagic_ message filtering. The filtering system is automa
foreach (feed in feeds_to_update) do
messages = download_messages(feed)
filtered_messages = filter_messages(messages)
save_messages_to_database(filtered_messages)
```
As you can see, RSS Guard processes all feeds scheduled for message downloading one by one; downloading new messages, feeding them to filtering system and then saving all approved messages to RSS Guard's database.
@ -19,18 +18,18 @@ Message filter consists of arbitrary JavaScript code which must provide function
function filterMessage() { }
```
This function must be fast and must return values which belong to enumeration `FilteringAction` from this [file](https://github.com/martinrotter/rssguard/blob/master/src/librssguard/core/message.h). You can you either direct numerical value of each enumerant, for example `2` or you can use self-descriptive enumerant name, for example `MessageObject.Ignore`. Named enumerants are supported in RSS Guard 3.8.1+. RSS Guard 3.7.1+ also offers names `MSG_ACCEPT` and `MSG_IGNORE` as aliases for `MessageObject.Accept` and `MessageObject.Ignore`.
This function must be fast and must return values which belong to enumeration `FilteringAction` from this [file](https://github.com/martinrotter/rssguard/blob/master/src/librssguard/core/messageobject.h). You can you use either direct numerical value of each enumerant, for example `2` or you can use self-descriptive enumerant name, for example `MessageObject.Ignore`. There are also names `MSG_ACCEPT` and `MSG_IGNORE` as aliases for `MessageObject.Accept` and `MessageObject.Ignore`.
Each message is accessible in your script via global variable named `msg` of type `MessageObject`, see this [file](https://github.com/martinrotter/rssguard/blob/master/src/librssguard/core/message.h) for the declaration. Some properties are writable, allowing you to change contents of the message before it is written to DB. You can mark message important, parse its description or perhaps change author name or even assign some label to it!!!
Each message is accessible in your script via global variable named `msg` of type `MessageObject`, see this [file](https://github.com/martinrotter/rssguard/blob/master/src/librssguard/core/messageobject.h) for the declaration. Some properties are writable, allowing you to change contents of the message before it is written to DB. You can mark message important, parse its description or perhaps change author name or even assign some label to it!!!
RSS Guard 3.8.0+ offers also list of labels assigned to each message. You can therefore do actions in your filtering script based on which labels are assigned to the message. The property is called `assignedLabels` and is array of `Label` objects. Each `Label` in the array offers these properties: `title` (title of the label), `color` (color of the label) and `customId` (account-specific ID of the label). If you change assigned labels to the message, then the change will be eventually synchronized back to server if respective plugin supports it.
RSS Guard also offers list of labels assigned to each message. You can therefore do actions in your filtering script based on which labels are assigned to the message. The property is called `assignedLabels` and is array of `Label` objects. If you change assigned labels to the message, then the change will be eventually synchronized back to server if respective plugin supports it.
Passed message also offers special function
```js
MessageObject.isDuplicateWithAttribute(DuplicationAttributeCheck)
```
which allows you to perform runtime check for existence of the message in RSS Guard's database. The parameter is integer value from enumeration `DuplicationAttributeCheck` from this [file](https://github.com/martinrotter/rssguard/blob/master/src/librssguard/core/message.h) and specifies how exactly you want to determine if given message is "duplicate". Again, you can use direct integer value or enumerant name.
which allows you to perform runtime check for existence of the message in RSS Guard's database. The parameter is integer value from enumeration `DuplicationAttributeCheck` from this [file](https://github.com/martinrotter/rssguard/blob/master/src/librssguard/core/messageobject.h) and specifies how exactly you want to determine if given message is "duplicate". Again, you can use direct integer values or enumerant names.
For example if you want to check if there is already another message with same author in database, then you call `msg.isDuplicateWithAttribute(MessageObject.SameAuthor)`. Enumeration even supports "flags" approach, thus you can combine multiple checks via bitwise `OR` operation in single call, for example like this: `msg.isDuplicateWithAttribute(MessageObject.SameAuthor | MessageObject.SameUrl)`.
@ -49,7 +48,7 @@ Here is the reference of methods and properties of some types available in your
| `String url` | URL of the message. |
| `String author` | Author of the message. |
| `String contents` | Contents of the message. |
| `Number score` | Arbitrary number in range <0.0, 100.0>. You can use this number to order messages in a custom fashion as this attribute also has its own column in messages list. |
| `Number score` | Arbitrary number in range <0.0, 100.0>. You can use this number to sort messages in a custom fashion as this attribute also has its own column in messages list. |
| `Date created` | Date/time of the message. |
| `Boolean isRead` | Is message read? |
| `Boolean isImportant` | Is message important? |
@ -57,7 +56,7 @@ Here is the reference of methods and properties of some types available in your
| `Boolean isDuplicateWithAttribute(DuplicationAttributeCheck)` | Allows you to test if this particular message is already stored in RSS Guard's DB. |
| `Boolean assignLabel(String)` | Assigns label to this message. The passed `String` value is the `customId` property of `Label` type. See its API reference for relevant info. Available in RSS Guard 3.8.1+. |
| `Boolean deassignLabel(String)` | Removes label from this message. The passed `String` value is the `customId` property of `Label` type. See its API reference for relevant info. Available in RSS Guard 3.8.1+. |
| `Boolean alreadyStoredInDb` | `READ-ONLY` Returns true if this message is already stored in DB. This function is the way to check if the filter is being run automatically for newly downloaded messages or manually for already existing messages. Available in RSS Guard 3.8.4+. |
| `Boolean alreadyStoredInDb` | `READ-ONLY` Returns true if this message is already stored in DB. This function is the way to check if the filter is being run automatically for newly downloaded messages or manually for already existing messages.
### `Label` class
| Property/method | Description |

@ -1 +1 @@
Subproject commit 47f4125753452eff8800dbd6600c5a05540b15d9
Subproject commit 9c10723bfbaf6cb85107d6ee16e0324e9e487749