"Fossies" - the Fresh Open Source Software Archive

Member "calibre-3.46.0/recipes/10minutos.recipe" (19 Jul 2019, 1988 Bytes) of package /linux/misc/calibre-3.46.0.tar.xz:


#!/usr/bin/env python2
##
# Title:        Diario 10minutos.com.uy News and Sports Calibre Recipe
# Contact:      Carlos Alves - <carlosalves90@gmail.com>
##
# License:      GNU General Public License v3 - http://www.gnu.org/copyleft/gpl.html
# Copyright:    Carlos Alves - <carlosalves90@gmail.com>
##
# Written:      September 2013
# Last Edited:  2018-02-13
##

__license__ = 'GPL v3'
__author__ = '2016, Carlos Alves <carlosalves90@gmail.com>'
'''
10minutos.com.uy
'''

from calibre.web.feeds.news import BasicNewsRecipe


class General(BasicNewsRecipe):
    # Metadata shown in calibre's news source list
    title = '10minutos'
    __author__ = 'Carlos Alves'
    description = 'Noticias de Salto - Uruguay'
    tags = 'news, sports'
    language = 'es_UY'
    timefmt = '[%a, %d %b, %Y]'
    use_embedded_content = False
    # NOTE: calibre documents this option as `recursions`; the name is kept
    # as in the original.
    recursion = 5
    encoding = 'utf8'
    remove_javascript = True
    no_stylesheets = True

    # Only fetch articles from the last 2 days, at most 100 per feed
    oldest_article = 2
    max_articles_per_feed = 100
    # Keep only the article body container
    keep_only_tags = [dict(name='div', attrs={'class': 'post-content'})]

    # Drop navigation, sharing widgets, post metadata and embedded objects
    remove_tags = [
        dict(name='div', attrs={'class': ['hr', 'titlebar', 'navigation']}),
        dict(name='div', attrs={'class': 'sharedaddy sd-sharing-enabled'}),
        dict(name='p', attrs={'class': 'post-meta'}),
        dict(name=['object', 'link'])
    ]

    extra_css = '''
                h1{font-family: Georgia,"Times New Roman",Times,serif}
                h3{font-family: Georgia,"Times New Roman",Times,serif}
                h2{font-family: Georgia,"Times New Roman",Times,serif}
                p{font-family: Verdana,Arial,Helvetica,sans-serif}
                body{font-family: Verdana,Arial,Helvetica,sans-serif}
                img{margin-bottom: 0.4em; display:block;}
                '''

    feeds = [
        (u'Articulos', u'http://10minutos.com.uy/?feed=rss2')
    ]

    def get_cover_url(self):
        # This recipe does not provide a cover image
        return None

    def preprocess_html(self, soup):
        # Strip inline style attributes so that extra_css controls the layout
        for item in soup.findAll(style=True):
            del item['style']
        return soup
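
The `preprocess_html` hook above deletes the inline `style` attribute from every element, so that only `extra_css` shapes the output. The sketch below illustrates that effect outside calibre. It is not part of the recipe and does not use calibre's soup object. It assumes Python 3 and only the standard library, and the names `StyleStripper` and `strip_inline_styles` are hypothetical helpers introduced for illustration. Comments, doctypes and processing instructions are dropped by this simplified re-emitter.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Hypothetical helper: re-emits HTML while dropping every `style` attribute."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _render_attrs(self, attrs):
        # Keep every attribute except `style`; valueless attributes stay bare.
        out = ''
        for name, value in attrs:
            if name == 'style':
                continue
            if value is None:
                out += ' ' + name
            else:
                out += ' {}="{}"'.format(name, escape(value, quote=True))
        return out

    def handle_starttag(self, tag, attrs):
        self.parts.append('<{}{}>'.format(tag, self._render_attrs(attrs)))

    def handle_startendtag(self, tag, attrs):
        self.parts.append('<{}{} />'.format(tag, self._render_attrs(attrs)))

    def handle_endtag(self, tag):
        self.parts.append('</{}>'.format(tag))

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape text.
        self.parts.append(escape(data, quote=False))


def strip_inline_styles(html):
    """Return `html` with all inline `style` attributes removed."""
    parser = StyleStripper()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return ''.join(parser.parts)


if __name__ == '__main__':
    sample = '<p style="color:red" class="x">Hi <b style="font-weight:700">there</b></p>'
    print(strip_inline_styles(sample))
```

Inside calibre the same result comes from mutating the parsed soup in place, as the recipe does; the re-emitting parser here is only a stand-in that makes the behaviour easy to try without calibre installed.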