https://wiki.openzim.org/w/index.php?title=Special:NewPages&feed=atom&limit=50&offset=&namespace=0&username=&tagfilter=&size-mode=max&size=0openZIM - New pages [en]2024-03-29T07:01:30ZFrom openZIMMediaWiki 1.36.1https://wiki.openzim.org/wiki/StephaneStephane2023-12-15T12:07:46Z<p>Stephane: Created page with "Yo, let's go."</p>
<hr />
<div>Yo, let's go.</div>Stephanehttps://wiki.openzim.org/wiki/Content_TeamContent Team2023-12-15T11:03:57Z<p>Kelson: /* Goals */</p>
<hr />
<div>The '''Content team''' gathers people in charge of providing books in the ZIM format ("books" being understood here as web content stored as single web archives). <br />
== Purpose ==<br />
<br />
Provide web-based educational content to people without internet access, and make the experience as seamless as possible. Access and discovery must be user-friendly and market-ready; the content must be up to date and as portable as technically possible.<br />
<br />
== Goals ==<br />
* Book curation must remain focused on educational material, broadly construed;<br />
* Books should have proper visual formatting;<br />
* Books should be up-to-date like custom apps;<br />
* The Kiwix Library should allow easy and friendly discovery of content.<br />
<br />
== Responsibilities ==<br />
* Content Requests<br />
** Collaborate with requesters to qualify requests properly. Keep them informed.<br />
** Ensure we are allowed and able to fulfill requests<br />
** Initiate new recipes and manage the first publication of new books<br />
** Collaborate with the scraper dev. team if necessary<br />
** Keep the tickets up to date<br />
<br />
* Scraping<br />
** Ensure Zimfarm works properly and contribute to its improvement with the dev. team<br />
** Analyze failures and unexpected behaviors<br />
** Ensure recipes run properly, fix configuration when necessary, and contribute to scraper improvements with the dev. team<br />
** Ensure workers are online and properly configured<br />
** Ensure the scrape lifecycle is healthy (reasonable pipeline size, running scrapes progressing appropriately, not too many failures)<br />
<br />
* Library management<br />
** Ensure ZIM filenames and locations (paths) are correct<br />
** Ensure ZIM metadata are correct<br />
** Ensure ZIM files are recent and kept up to date (AFAP)<br />
** Ensure the library is coherent and user-friendly<br />
<br />
== Policies ==<br />
<br />
=== Publishing ===<br />
* Content has to be legal in Switzerland<br />
* Content should not promote [https://en.wikipedia.org/wiki/Fringe_theory fringe theories]<br />
* Content should preferably be [https://en.wikipedia.org/wiki/Free_content free content]<br />
* If not free, content should be:<br />
** Open content OR<br />
** Educational content OR<br />
** Covered by a reproduction authorization<br />
* Any content we publish should<br />
** have (almost) no user-visible errors<br />
** have proper/correct metadata<br />
** be easily discoverable in the public library<br />
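<br />
The licensing part of the policy above can be read as a simple decision rule. The following is a minimal sketch only: the class and field names are hypothetical, and it treats every "should" as a hard requirement, which is a simplification of how the team actually weighs requests.<br />

```python
from dataclasses import dataclass


@dataclass
class ContentCandidate:
    """Hypothetical record of what is known about a requested source."""
    legal_in_switzerland: bool
    promotes_fringe_theory: bool
    is_free_content: bool
    is_open_content: bool = False
    is_educational: bool = False
    has_reproduction_authorization: bool = False


def may_be_published(c: ContentCandidate) -> bool:
    """Apply the legality and licensing rules of the publishing policy."""
    # Hard stops: illegal content, or content promoting fringe theories.
    if not c.legal_in_switzerland or c.promotes_fringe_theory:
        return False
    # Free content is preferred; otherwise one of the fallbacks must hold.
    return (
        c.is_free_content
        or c.is_open_content
        or c.is_educational
        or c.has_reproduction_authorization
    )
```

The quality criteria (errors, metadata, discoverability) are deliberately left out, since they can only be judged on the produced ZIM file.<br />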
<br />
=== Content Requests ===<br />
* Allow everybody to request new content, or changes to or deletion of existing content<br />
* Track the lifecycle of our content portfolio in full transparency<br />
* New content should be assessed and vetted against the publishing policy (see above)<br />
* Content requests should be closed:<br />
** when fully implemented (user visible)<br />
** when refused or impossible to implement<br />
* ZIM metadata should be provided for new content<br />
* Start scraping only once all prerequisites are satisfied<br />
<br />
=== Scraping ===<br />
* Scraping leadership means the initiative should come from the content team<br />
* The first analysis of an error should be done by the content team<br />
* If an error in the scraper is suspected:<br />
** An issue should be filed in the corresponding scraper code repository<br />
** Scraper problem analysis does not supersede the content request in any manner<br />
* ZIM quality should be vetted against the publishing policy<br />
* Any recipe should first run successfully in dev before being put into production<br />
* Hardware resources should be used sparingly<br />
<br />
=== Library Management ===<br />
<br />
=== Custom Apps ===<br />
<br />
== Processes ==<br />
<br />
=== Content Requests ===<br />
<br />
=== Scraping ===<br />
<br />
=== Library Management ===<br />
<br />
=== Custom Apps ===<br />
<br />
== Workflows ==<br />
<br />
<br />
=== Creating a new recipe for YouTube files ===<br />
<br />
'''It is recommended to clone an existing YouTube recipe.'''<br />
<br />
* Name the recipe according to the [https://github.com/openzim/overview/wiki/Naming-Convention naming conventions].<br />
* In the Language field, choose the language of the website you are creating the recipe for.<br />
* In the Category field, choose (other).<br />
* In the Warehouse path field, always choose (/.hidden/.dev) for the first run so that the resulting file can be tested. Once the file has been tested and everything is correct, update the recipe with the proper path (videos).<br />
* Make sure the Status is set to Enabled.<br />
* Set Periodicity to monthly or quarterly.<br />
* In the Offliner field, choose Youtube.<br />
* In the Platform field, choose Youtube.<br />
* Leave the remaining fields unchanged.<br />
<br />
'''In the Youtube command flags:'''<br />
<br />
* In Playlist mode: choose (Not Set) if the recipe covers a whole channel.<br />
* If the recipe covers a playlist, choose (Set).<br />
* In Type: choose (Channel) or (Playlist), depending on the file you need.<br />
* In Youtube ID: type the ID of the channel or the playlist.<br />
* For the API Key: there is a list of keys, mostly organized by channel or playlist size. Ask for the list and choose the appropriate key.<br />
* In Zim Name: enter the recipe name, following the [https://github.com/openzim/overview/wiki/Naming-Convention naming conventions].<br />
* In Title: type the name you want for the output file.<br />
* In Description: type a short description of the ZIM file.<br />
* Leave Optimisation Cache URL as it is (cloned from the old recipe).<br />
* Leave the rest of the fields empty or as set in the cloned recipe.<br />
* Finally, click (Update offliner details) at the bottom.<br />
* Review all your entries once again, then go back to the top of the page and click (Request).<br />
* After about an hour, check whether the recipe failed or succeeded (or check the next day if the source website is large).<br />
* If it succeeded, go to [https://dev.library.kiwix.org/ dev.library.kiwix.org] and check the file you created: verify its size and that it works properly. If the file does not appear, wait a bit, as updates are made every 15 minutes.<br />
* If the file looks good and complete, go back to your recipe and, in the Warehouse path field, change (/.hidden/.dev) to the proper category for your file's content (Wikipedia, Wikihow, etc.).<br />
* Click (Update offliner details), then click (Request) again.<br />
* Finally, check the file at https://library.kiwix.org/. If all is good, do not forget to go back to the initial ticket (most likely at zim-requests), post the link to the output file, and close the ticket.<br />
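<br />
The two-stage publication described above (first run in the dev path, then promotion to the final path) can be sketched in code. This is an illustration only: the <code>YoutubeRecipe</code> class, its field names, and the <code>problems</code> and <code>promote</code> helpers are hypothetical and simply mirror the form labels; they are not the Zimfarm schema or API.<br />

```python
from dataclasses import dataclass, replace
from typing import List

DEV_PATH = "/.hidden/.dev"  # staging path named in the workflow above


@dataclass(frozen=True)
class YoutubeRecipe:
    """Hypothetical mirror of the recipe form; not the Zimfarm schema."""
    name: str
    language: str
    youtube_id: str
    id_type: str  # "channel" or "playlist"
    warehouse_path: str = DEV_PATH  # every new recipe starts in dev
    category: str = "other"
    enabled: bool = True
    periodicity: str = "monthly"


def problems(recipe: YoutubeRecipe) -> List[str]:
    """Return the checklist items that the recipe does not satisfy."""
    found = []
    if recipe.id_type not in ("channel", "playlist"):
        found.append("Type must be channel or playlist")
    if recipe.periodicity not in ("monthly", "quarterly"):
        found.append("Periodicity should be monthly or quarterly")
    if not recipe.enabled:
        found.append("Status must be Enabled")
    if not recipe.youtube_id:
        found.append("Youtube ID is missing")
    return found


def promote(recipe: YoutubeRecipe, final_path: str, dev_file_ok: bool) -> YoutubeRecipe:
    """Move a recipe out of the dev path only after the dev file was checked."""
    if recipe.warehouse_path != DEV_PATH:
        raise ValueError("recipe is not in the dev path")
    if not dev_file_ok:
        raise ValueError("check the file on dev.library.kiwix.org first")
    return replace(recipe, warehouse_path=final_path)
```

With a placeholder recipe such as <code>YoutubeRecipe(name="example_en_all", language="en", youtube_id="PLACEHOLDER_ID", id_type="channel")</code>, calling <code>promote(recipe, "videos", dev_file_ok=True)</code> returns a copy with the final path, while <code>dev_file_ok=False</code> raises an error, which matches the rule that a file must be verified in dev before it reaches the public library.<br />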
<br />
== Members ==<br />
* [https://github.com/Popolechien Popolechien], manager in line<br />
* [https://github.com/RavanJAltaie Ravan], content manager<br />
* [https://github.com/benoit74 Benoit74], scrapers lead dev<br />
<br />
== See also ==<br />
* [[Content strategy]]</div>Kelson