Initial PoC for litehtml integration.

Martin Rotter 2022-03-25 14:28:34 +01:00
parent 35b2a1c07b
commit f571a2d01b
171 changed files with 56827 additions and 520 deletions

View File

@ -3,7 +3,7 @@ name: rssguard
on:
push:
- branches: ["master", "version-4"]
+ branches: ["*"]
tags: ["*"]
jobs:

View File

@ -68,7 +68,7 @@ set(APP_URL_ISSUES_NEW "https://github.com/martinrotter/rssguard/issues/new/choo
set(TYPEINFO "????")
- project(rssguard VERSION ${APP_VERSION} LANGUAGES CXX)
+ project(rssguard VERSION ${APP_VERSION} LANGUAGES CXX C)
# Basic C++ related behavior of cmake.
set(CMAKE_CXX_STANDARD 17)

View File

@ -0,0 +1,2 @@
Copyright 2010, 2011 Google Inc.
Copyright 2008-2009 Bjoern Hoehrmann <bjoern@hoehrmann.de>

View File

@ -0,0 +1,24 @@
Copyright (c) 2013, Yuri Kobets (tordex)
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the <organization> nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@ -0,0 +1,38 @@
# What is litehtml?
**litehtml** is a lightweight HTML rendering engine with CSS2/CSS3 support. Note that **litehtml** itself does not draw any text, pictures or other graphics, and it does not depend on any image, drawing or font library. You are free to use any library to draw images, fonts and any other graphics. **litehtml** just parses HTML/CSS and places the HTML elements into the correct positions (renders HTML). To draw the HTML elements you have to implement the callback interface [document_container](https://github.com/litehtml/litehtml/wiki/document_container). This interface is really simple, check it out! The [document_container](https://github.com/litehtml/litehtml/wiki/document_container) implementation is required to render HTML correctly.
# Where litehtml can be used
**litehtml** can be used when you need to show HTML-formatted text or even to create a mini-browser, but using it as a full-featured HTML engine is not recommended. Usually you don't need something like WebKit to show simple HTML tooltips or HTML-formatted text. **litehtml** is a better fit for these cases because it is more lightweight and easier to integrate into your application.
## HTML Parser
**litehtml** uses the [gumbo-parser](https://github.com/google/gumbo-parser) to parse HTML. Gumbo is an implementation of the HTML5 parsing algorithm implemented as a pure C99 library with no outside dependencies. It's designed to serve as a building block for other tools and libraries such as linters, validators, templating languages, and refactoring and analysis tools.
## Compatibility
**litehtml** is compatible with any platform supported by C++ and STL. On Windows, MS Visual Studio 2013 is recommended. **litehtml** supports both UTF-8 and Unicode strings on Windows and UTF-8 strings on Linux and Haiku.
## Support for HTML and CSS standards
Unfortunately, **litehtml** is not fully compatible with the HTML/CSS standards, and there is a lot of work left to make it render as well as modern browsers. However, **litehtml** supports most HTML tags and CSS properties. You can find the list of supported CSS properties in [this table](https://docs.google.com/spreadsheet/ccc?key=0AvHXl5n24PuhdHdELUdhaUl4OGlncXhDcDJuM1JpMnc&usp=sharing). For most simple use cases the HTML/CSS features supported by **litehtml** are enough. Right now **litehtml** even handles some pages with very complex HTML/CSS designs. As an example, pages created with the [bootstrap framework](http://getbootstrap.com/) are usually well formatted by **litehtml**.
## Testing litehtml
You can [download the simple browser](http://www.litehtml.com/download.html) (**litebrowser**) to test the **litehtml** rendering engine.
The litebrowser source code is available on GitHub:
* [For Windows](https://github.com/litehtml/litebrowser)
* [For Linux](https://github.com/litehtml/litebrowser-linux)
* [For Haiku](https://github.com/adamfowleruk/litebrowser-haiku)
## License
**litehtml** is distributed under the [New BSD License](https://opensource.org/licenses/BSD-3-Clause).
The **gumbo-parser** is distributed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
## Links
* [source code](https://github.com/litehtml/litehtml)
* [website](http://www.litehtml.com/)

View File

@ -0,0 +1,10 @@
#ifndef LITEHTML_H
#define LITEHTML_H
#include <litehtml/html.h>
#include <litehtml/document.h>
#include <litehtml/html_tag.h>
#include <litehtml/stylesheet.h>
#include <litehtml/element.h>
#endif // LITEHTML_H

View File

@ -0,0 +1,35 @@
#ifndef LH_ATTRIBUTES_H
#define LH_ATTRIBUTES_H
namespace litehtml
{
struct attr_color
{
unsigned char rgbBlue;
unsigned char rgbGreen;
unsigned char rgbRed;
unsigned char rgbAlpha;
attr_color()
{
rgbAlpha = 255;
rgbBlue = 0;
rgbGreen = 0;
rgbRed = 0;
}
};
struct attr_border
{
style_border border;
int width;
attr_color color;
attr_border()
{
border = borderNone;
width = 0;
}
};
}
#endif // LH_ATTRIBUTES_H

View File

@ -0,0 +1,58 @@
#ifndef LH_BACKGROUND_H
#define LH_BACKGROUND_H
#include "types.h"
#include "attributes.h"
#include "css_length.h"
#include "css_position.h"
#include "web_color.h"
#include "borders.h"
namespace litehtml
{
class background
{
public:
tstring m_image;
tstring m_baseurl;
web_color m_color;
background_attachment m_attachment;
css_position m_position;
background_repeat m_repeat;
background_box m_clip;
background_box m_origin;
css_border_radius m_radius;
public:
background();
background(const background& val);
~background() = default;
background& operator=(const background& val);
};
class background_paint
{
public:
tstring image;
tstring baseurl;
background_attachment attachment;
background_repeat repeat;
web_color color;
position clip_box;
position origin_box;
position border_box;
border_radiuses border_radius;
size image_size;
int position_x;
int position_y;
bool is_root;
public:
background_paint();
background_paint(const background_paint& val);
background_paint& operator=(const background& val);
};
}
#endif // LH_BACKGROUND_H

View File

@ -0,0 +1,305 @@
#ifndef LH_BORDERS_H
#define LH_BORDERS_H
#include "css_length.h"
#include "types.h"
namespace litehtml
{
struct css_border
{
css_length width;
border_style style;
web_color color;
css_border()
{
style = border_style_none;
}
css_border(const css_border& val)
{
width = val.width;
style = val.style;
color = val.color;
}
css_border& operator=(const css_border& val)
{
width = val.width;
style = val.style;
color = val.color;
return *this;
}
};
struct border
{
int width;
border_style style;
web_color color;
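border()
{
width = 0;
// Initialise the style as css_border does, so a default border is never read uninitialised.
style = border_style_none;
}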
border(const border& val)
{
width = val.width;
style = val.style;
color = val.color;
}
border(const css_border& val)
{
width = (int) val.width.val();
style = val.style;
color = val.color;
}
border& operator=(const border& val)
{
width = val.width;
style = val.style;
color = val.color;
return *this;
}
border& operator=(const css_border& val)
{
width = (int) val.width.val();
style = val.style;
color = val.color;
return *this;
}
};
struct border_radiuses
{
int top_left_x;
int top_left_y;
int top_right_x;
int top_right_y;
int bottom_right_x;
int bottom_right_y;
int bottom_left_x;
int bottom_left_y;
border_radiuses()
{
top_left_x = 0;
top_left_y = 0;
top_right_x = 0;
top_right_y = 0;
bottom_right_x = 0;
bottom_right_y = 0;
bottom_left_x = 0;
bottom_left_y = 0;
}
border_radiuses(const border_radiuses& val)
{
top_left_x = val.top_left_x;
top_left_y = val.top_left_y;
top_right_x = val.top_right_x;
top_right_y = val.top_right_y;
bottom_right_x = val.bottom_right_x;
bottom_right_y = val.bottom_right_y;
bottom_left_x = val.bottom_left_x;
bottom_left_y = val.bottom_left_y;
}
border_radiuses& operator = (const border_radiuses& val)
{
top_left_x = val.top_left_x;
top_left_y = val.top_left_y;
top_right_x = val.top_right_x;
top_right_y = val.top_right_y;
bottom_right_x = val.bottom_right_x;
bottom_right_y = val.bottom_right_y;
bottom_left_x = val.bottom_left_x;
bottom_left_y = val.bottom_left_y;
return *this;
}
void operator += (const margins& mg)
{
top_left_x += mg.left;
top_left_y += mg.top;
top_right_x += mg.right;
top_right_y += mg.top;
bottom_right_x += mg.right;
bottom_right_y += mg.bottom;
bottom_left_x += mg.left;
bottom_left_y += mg.bottom;
fix_values();
}
void operator -= (const margins& mg)
{
top_left_x -= mg.left;
top_left_y -= mg.top;
top_right_x -= mg.right;
top_right_y -= mg.top;
bottom_right_x -= mg.right;
bottom_right_y -= mg.bottom;
bottom_left_x -= mg.left;
bottom_left_y -= mg.bottom;
fix_values();
}
void fix_values()
{
if (top_left_x < 0) top_left_x = 0;
if (top_left_y < 0) top_left_y = 0;
if (top_right_x < 0) top_right_x = 0;
if (top_right_y < 0) top_right_y = 0;
if (bottom_right_x < 0) bottom_right_x = 0;
if (bottom_right_y < 0) bottom_right_y = 0;
if (bottom_left_x < 0) bottom_left_x = 0;
if (bottom_left_y < 0) bottom_left_y = 0;
}
};
struct css_border_radius
{
css_length top_left_x;
css_length top_left_y;
css_length top_right_x;
css_length top_right_y;
css_length bottom_right_x;
css_length bottom_right_y;
css_length bottom_left_x;
css_length bottom_left_y;
css_border_radius()
{
}
css_border_radius(const css_border_radius& val)
{
top_left_x = val.top_left_x;
top_left_y = val.top_left_y;
top_right_x = val.top_right_x;
top_right_y = val.top_right_y;
bottom_left_x = val.bottom_left_x;
bottom_left_y = val.bottom_left_y;
bottom_right_x = val.bottom_right_x;
bottom_right_y = val.bottom_right_y;
}
css_border_radius& operator=(const css_border_radius& val)
{
top_left_x = val.top_left_x;
top_left_y = val.top_left_y;
top_right_x = val.top_right_x;
top_right_y = val.top_right_y;
bottom_left_x = val.bottom_left_x;
bottom_left_y = val.bottom_left_y;
bottom_right_x = val.bottom_right_x;
bottom_right_y = val.bottom_right_y;
return *this;
}
border_radiuses calc_percents(int width, int height)
{
border_radiuses ret;
ret.bottom_left_x = bottom_left_x.calc_percent(width);
ret.bottom_left_y = bottom_left_y.calc_percent(height);
ret.top_left_x = top_left_x.calc_percent(width);
ret.top_left_y = top_left_y.calc_percent(height);
ret.top_right_x = top_right_x.calc_percent(width);
ret.top_right_y = top_right_y.calc_percent(height);
ret.bottom_right_x = bottom_right_x.calc_percent(width);
ret.bottom_right_y = bottom_right_y.calc_percent(height);
return ret;
}
};
struct css_borders
{
css_border left;
css_border top;
css_border right;
css_border bottom;
css_border_radius radius;
css_borders() = default;
bool is_visible() const
{
return left.width.val() != 0 || right.width.val() != 0 || top.width.val() != 0 || bottom.width.val() != 0;
}
css_borders(const css_borders& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
radius = val.radius;
}
css_borders& operator=(const css_borders& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
radius = val.radius;
return *this;
}
};
struct borders
{
border left;
border top;
border right;
border bottom;
border_radiuses radius;
borders() = default;
borders(const borders& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
radius = val.radius;
}
borders(const css_borders& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
}
bool is_visible() const
{
return left.width != 0 || right.width != 0 || top.width != 0 || bottom.width != 0;
}
borders& operator=(const borders& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
radius = val.radius;
return *this;
}
borders& operator=(const css_borders& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
return *this;
}
};
}
#endif // LH_BORDERS_H
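
The `+=` and `-=` operators on `border_radiuses` above shift each corner radius by the margins on its two adjacent sides and then clamp negative results to zero, while `css_border_radius::calc_percents` resolves horizontal radii against the width and vertical radii against the height. The following is a minimal standalone sketch of that behaviour, not litehtml's own code: `demo_margins`, `demo_radii` and `resolve_percent` are hypothetical stand-ins, only two corners are kept for brevity, and `operator-=` returns a reference here whereas the header above returns `void`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for litehtml::margins, trimmed so the sketch compiles alone.
struct demo_margins
{
    int left = 0;
    int right = 0;
    int top = 0;
    int bottom = 0;
};

// Hypothetical stand-in for litehtml::border_radiuses with two corners only.
struct demo_radii
{
    int top_left_x = 0;
    int top_left_y = 0;
    int bottom_right_x = 0;
    int bottom_right_y = 0;

    // Shrink each radius by the margin on the adjacent side, then clamp.
    demo_radii& operator-=(const demo_margins& mg)
    {
        top_left_x -= mg.left;
        top_left_y -= mg.top;
        bottom_right_x -= mg.right;
        bottom_right_y -= mg.bottom;
        fix_values();
        return *this;
    }

    // A radius cannot be negative, so clamp every component at zero.
    void fix_values()
    {
        top_left_x = std::max(top_left_x, 0);
        top_left_y = std::max(top_left_y, 0);
        bottom_right_x = std::max(bottom_right_x, 0);
        bottom_right_y = std::max(bottom_right_y, 0);
    }
};

// Resolve a percentage against a box dimension, truncating toward zero.
inline int resolve_percent(int box_size, double percent)
{
    return static_cast<int>(box_size * percent / 100.0);
}
```

Shrinking a radius of 2 by a margin of 3 yields 0 rather than -1, which is the reason `fix_values()` runs after every adjustment.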

View File

@ -0,0 +1,120 @@
#ifndef LH_BOX_H
#define LH_BOX_H
#include <memory> // std::unique_ptr
#include <vector> // std::vector
namespace litehtml
{
class html_tag;
enum box_type
{
box_block,
box_line
};
class box
{
public:
typedef std::unique_ptr<litehtml::box> ptr;
typedef std::vector< box::ptr > vector;
protected:
int m_box_top;
int m_box_left;
int m_box_right;
public:
box(int top, int left, int right)
{
m_box_top = top;
m_box_left = left;
m_box_right = right;
}
virtual ~box() = default;
int bottom() const { return m_box_top + height(); }
int top() const { return m_box_top; }
int right() const { return m_box_left + width(); }
int left() const { return m_box_left; }
virtual litehtml::box_type get_type() const = 0;
virtual int height() const = 0;
virtual int width() const = 0;
virtual void add_element(const element::ptr &el) = 0;
virtual bool can_hold(const element::ptr &el, white_space ws) const = 0;
virtual void finish(bool last_box = false) = 0;
virtual bool is_empty() const = 0;
virtual int baseline() const = 0;
virtual void get_elements(elements_vector& els) = 0;
virtual int top_margin() const = 0;
virtual int bottom_margin() const = 0;
virtual void y_shift(int shift) = 0;
virtual void new_width(int left, int right, elements_vector& els) = 0;
};
//////////////////////////////////////////////////////////////////////////
class block_box : public box
{
element::ptr m_element;
public:
block_box(int top, int left, int right) : box(top, left, right)
{
m_element = nullptr;
}
litehtml::box_type get_type() const override;
int height() const override;
int width() const override;
void add_element(const element::ptr &el) override;
bool can_hold(const element::ptr &el, white_space ws) const override;
void finish(bool last_box = false) override;
bool is_empty() const override;
int baseline() const override;
void get_elements(elements_vector& els) override;
int top_margin() const override;
int bottom_margin() const override;
void y_shift(int shift) override;
void new_width(int left, int right, elements_vector& els) override;
};
//////////////////////////////////////////////////////////////////////////
class line_box : public box
{
elements_vector m_items;
int m_height;
int m_width;
int m_line_height;
font_metrics m_font_metrics;
int m_baseline;
text_align m_text_align;
public:
line_box(int top, int left, int right, int line_height, font_metrics& fm, text_align align) : box(top, left, right)
{
m_height = 0;
m_width = 0;
m_font_metrics = fm;
m_line_height = line_height;
m_baseline = 0;
m_text_align = align;
}
litehtml::box_type get_type() const override;
int height() const override;
int width() const override;
void add_element(const element::ptr &el) override;
bool can_hold(const element::ptr &el, white_space ws) const override;
void finish(bool last_box = false) override;
bool is_empty() const override;
int baseline() const override;
void get_elements(elements_vector& els) override;
int top_margin() const override;
int bottom_margin() const override;
void y_shift(int shift) override;
void new_width(int left, int right, elements_vector& els) override;
private:
bool have_last_space() const;
bool is_break_only() const;
};
}
#endif // LH_BOX_H

View File

@ -0,0 +1,51 @@
// Copyright (C) 2020-2021 Primate Labs Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the names of the copyright holders nor the names of their
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#ifndef LITEHTML_CODEPOINT_H__
#define LITEHTML_CODEPOINT_H__
#include <string>
#include "litehtml/os_types.h"
namespace litehtml {
bool is_ascii_codepoint(litehtml::tchar_t c);
// Returns true if the codepoint is a reserved codepoint for URLs.
// https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
bool is_url_reserved_codepoint(litehtml::tchar_t c);
// Returns true if the codepoint is a scheme codepoint for URLs.
// https://datatracker.ietf.org/doc/html/rfc3986#section-3.1
bool is_url_scheme_codepoint(litehtml::tchar_t c);
} // namespace litehtml
#endif // LITEHTML_CODEPOINT_H__
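
The header above only declares the codepoint predicates, and their definitions are not part of this excerpt. As a rough illustration of what the referenced RFC 3986 sections describe, here is a sketch over plain `char`. The `demo_` names are hypothetical, and the real functions take `litehtml::tchar_t` and may be implemented differently (for example with lookup tables).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers over plain char, for illustration only.
inline bool demo_is_ascii_alpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

inline bool demo_is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

// RFC 3986 section 2.2: reserved = gen-delims / sub-delims
//   gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
//   sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
inline bool demo_is_url_reserved(char c)
{
    static const char reserved[] = ":/?#[]@!$&'()*+,;=";
    // strchr would match the terminating NUL, so reject it explicitly.
    return c != '\0' && std::strchr(reserved, c) != nullptr;
}

// RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// This checks membership in the character set only, not the "first must be ALPHA" rule.
inline bool demo_is_url_scheme(char c)
{
    return demo_is_ascii_alpha(c) || demo_is_ascii_digit(c) ||
           c == '+' || c == '-' || c == '.';
}
```

A URL parser could use the scheme predicate to scan up to the first `:` and fall back to treating the input as a relative reference when a non-scheme character appears first.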

View File

@ -0,0 +1,20 @@
#ifndef LH_CONTEXT_H
#define LH_CONTEXT_H
#include "stylesheet.h"
namespace litehtml
{
class context
{
litehtml::css m_master_css;
public:
void load_master_stylesheet(const tchar_t* str);
litehtml::css& master_css()
{
return m_master_css;
}
};
}
#endif // LH_CONTEXT_H

View File

@ -0,0 +1,135 @@
#ifndef LH_CSS_LENGTH_H
#define LH_CSS_LENGTH_H
#include "types.h"
namespace litehtml
{
class css_length
{
union
{
float m_value;
int m_predef;
};
css_units m_units;
bool m_is_predefined;
public:
css_length();
css_length(const css_length& val);
css_length& operator=(const css_length& val);
css_length& operator=(float val);
bool is_predefined() const;
void predef(int val);
int predef() const;
void set_value(float val, css_units units);
float val() const;
css_units units() const;
int calc_percent(int width) const;
void fromString(const tstring& str, const tstring& predefs = _t(""), int defValue = 0);
};
// css_length inlines
inline css_length::css_length()
{
m_value = 0;
m_predef = 0;
m_units = css_units_none;
m_is_predefined = false;
}
inline css_length::css_length(const css_length& val)
{
if(val.is_predefined())
{
m_predef = val.m_predef;
} else
{
m_value = val.m_value;
}
m_units = val.m_units;
m_is_predefined = val.m_is_predefined;
}
inline css_length& css_length::operator=(const css_length& val)
{
if(val.is_predefined())
{
m_predef = val.m_predef;
} else
{
m_value = val.m_value;
}
m_units = val.m_units;
m_is_predefined = val.m_is_predefined;
return *this;
}
inline css_length& css_length::operator=(float val)
{
m_value = val;
m_units = css_units_px;
m_is_predefined = false;
return *this;
}
inline bool css_length::is_predefined() const
{
return m_is_predefined;
}
inline void css_length::predef(int val)
{
m_predef = val;
m_is_predefined = true;
}
inline int css_length::predef() const
{
if(m_is_predefined)
{
return m_predef;
}
return 0;
}
inline void css_length::set_value(float val, css_units units)
{
m_value = val;
m_is_predefined = false;
m_units = units;
}
inline float css_length::val() const
{
if(!m_is_predefined)
{
return m_value;
}
return 0;
}
inline css_units css_length::units() const
{
return m_units;
}
inline int css_length::calc_percent(int width) const
{
if(!is_predefined())
{
if(units() == css_units_percentage)
{
return (int) ((double) width * (double) m_value / 100.0);
} else
{
return (int) val();
}
}
return 0;
}
}
#endif // LH_CSS_LENGTH_H
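
`css_length` above stores either a numeric value with units or an index into a list of predefined keywords, and `calc_percent` resolves a percentage against a reference width. The sketch below mirrors that logic in a standalone class so the arithmetic can be tried in isolation. `demo_length` and `demo_units` are hypothetical names, and the union used by the real class is replaced by two separate members for simplicity.

```cpp
#include <cassert>

// Hypothetical subset of litehtml::css_units.
enum demo_units
{
    demo_units_px,
    demo_units_percentage
};

// Simplified stand-in for litehtml::css_length (separate members instead of a union).
class demo_length
{
    float m_value = 0;
    int m_predef = 0;
    demo_units m_units = demo_units_px;
    bool m_is_predefined = false;

public:
    void set_value(float val, demo_units units)
    {
        m_value = val;
        m_units = units;
        m_is_predefined = false;
    }

    // Store a keyword index (such as "auto") instead of a number.
    void predef(int val)
    {
        m_predef = val;
        m_is_predefined = true;
    }

    int predef() const { return m_is_predefined ? m_predef : 0; }

    // Percentages scale with the reference width; other units are truncated
    // to int as-is; predefined keywords resolve to 0.
    int calc_percent(int width) const
    {
        if (m_is_predefined)
        {
            return 0;
        }
        if (m_units == demo_units_percentage)
        {
            return static_cast<int>(static_cast<double>(width) * m_value / 100.0);
        }
        return static_cast<int>(m_value);
    }
};
```

Note that a non-percentage value is returned unchanged regardless of its unit, so callers that need unit conversion have to do it before or after this call.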

View File

@ -0,0 +1,36 @@
#ifndef LH_CSS_MARGINS_H
#define LH_CSS_MARGINS_H
#include "css_length.h"
namespace litehtml
{
struct css_margins
{
css_length left;
css_length right;
css_length top;
css_length bottom;
css_margins() = default;
css_margins(const css_margins& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
}
css_margins& operator=(const css_margins& val)
{
left = val.left;
right = val.right;
top = val.top;
bottom = val.bottom;
return *this;
}
};
}
#endif // LH_CSS_MARGINS_H

View File

@ -0,0 +1,36 @@
#ifndef LH_CSS_OFFSETS_H
#define LH_CSS_OFFSETS_H
#include "css_length.h"
namespace litehtml
{
struct css_offsets
{
css_length left;
css_length top;
css_length right;
css_length bottom;
css_offsets() = default;
css_offsets(const css_offsets& val)
{
left = val.left;
top = val.top;
right = val.right;
bottom = val.bottom;
}
css_offsets& operator=(const css_offsets& val)
{
left = val.left;
top = val.top;
right = val.right;
bottom = val.bottom;
return *this;
}
};
}
#endif // LH_CSS_OFFSETS_H

View File

@ -0,0 +1,36 @@
#ifndef LH_CSS_POSITION_H
#define LH_CSS_POSITION_H
#include "css_length.h"
namespace litehtml
{
struct css_position
{
css_length x;
css_length y;
css_length width;
css_length height;
css_position() = default;
css_position(const css_position& val)
{
x = val.x;
y = val.y;
width = val.width;
height = val.height;
}
css_position& operator=(const css_position& val)
{
x = val.x;
y = val.y;
width = val.width;
height = val.height;
return *this;
}
};
}
#endif // LH_CSS_POSITION_H

View File

@ -0,0 +1,278 @@
#ifndef LH_CSS_SELECTOR_H
#define LH_CSS_SELECTOR_H
#include "style.h"
#include "media_query.h"
namespace litehtml
{
//////////////////////////////////////////////////////////////////////////
struct selector_specificity
{
int a;
int b;
int c;
int d;
explicit selector_specificity(int va = 0, int vb = 0, int vc = 0, int vd = 0)
{
a = va;
b = vb;
c = vc;
d = vd;
}
void operator += (const selector_specificity& val)
{
a += val.a;
b += val.b;
c += val.c;
d += val.d;
}
bool operator==(const selector_specificity& val) const
{
return a == val.a && b == val.b && c == val.c && d == val.d;
}
bool operator!=(const selector_specificity& val) const
{
return !(*this == val);
}
bool operator > (const selector_specificity& val) const
{
if(a > val.a)
{
return true;
} else if(a < val.a)
{
return false;
} else
{
if(b > val.b)
{
return true;
} else if(b < val.b)
{
return false;
} else
{
if(c > val.c)
{
return true;
} else if(c < val.c)
{
return false;
} else
{
if(d > val.d)
{
return true;
} else if(d < val.d)
{
return false;
}
}
}
}
return false;
}
bool operator >= (const selector_specificity& val) const
{
if((*this) == val) return true;
if((*this) > val) return true;
return false;
}
bool operator <= (const selector_specificity& val) const
{
if((*this) > val)
{
return false;
}
return true;
}
bool operator < (const selector_specificity& val) const
{
if((*this) <= val && (*this) != val)
{
return true;
}
return false;
}
};
//////////////////////////////////////////////////////////////////////////
enum attr_select_condition
{
select_exists,
select_equal,
select_contain_str,
select_start_str,
select_end_str,
select_pseudo_class,
select_pseudo_element,
};
//////////////////////////////////////////////////////////////////////////
struct css_attribute_selector
{
typedef std::vector<css_attribute_selector> vector;
tstring attribute;
tstring val;
string_vector class_val;
attr_select_condition condition;
css_attribute_selector()
{
condition = select_exists;
}
};
//////////////////////////////////////////////////////////////////////////
class css_element_selector
{
public:
tstring m_tag;
css_attribute_selector::vector m_attrs;
public:
void parse(const tstring& txt);
};
//////////////////////////////////////////////////////////////////////////
enum css_combinator
{
combinator_descendant,
combinator_child,
combinator_adjacent_sibling,
combinator_general_sibling
};
//////////////////////////////////////////////////////////////////////////
class css_selector
{
public:
typedef std::shared_ptr<css_selector> ptr;
typedef std::vector<css_selector::ptr> vector;
public:
selector_specificity m_specificity;
css_element_selector m_right;
css_selector::ptr m_left;
css_combinator m_combinator;
tstring m_style;
int m_order;
media_query_list::ptr m_media_query;
tstring m_baseurl;
public:
explicit css_selector(const media_query_list::ptr& media, const tstring& baseurl)
{
m_media_query = media;
m_baseurl = baseurl;
m_combinator = combinator_descendant;
m_order = 0;
}
~css_selector() = default;
css_selector(const css_selector& val)
{
m_right = val.m_right;
if(val.m_left)
{
m_left = std::make_shared<css_selector>(*val.m_left);
} else
{
m_left = nullptr;
}
m_combinator = val.m_combinator;
m_specificity = val.m_specificity;
m_order = val.m_order;
m_media_query = val.m_media_query;
}
bool parse(const tstring& text);
void calc_specificity();
bool is_media_valid() const;
void add_media_to_doc(document* doc) const;
};
inline bool css_selector::is_media_valid() const
{
if(!m_media_query)
{
return true;
}
return m_media_query->is_used();
}
//////////////////////////////////////////////////////////////////////////
inline bool operator > (const css_selector& v1, const css_selector& v2)
{
if(v1.m_specificity == v2.m_specificity)
{
return (v1.m_order > v2.m_order);
}
return (v1.m_specificity > v2.m_specificity);
}
inline bool operator < (const css_selector& v1, const css_selector& v2)
{
if(v1.m_specificity == v2.m_specificity)
{
return (v1.m_order < v2.m_order);
}
return (v1.m_specificity < v2.m_specificity);
}
inline bool operator >(const css_selector::ptr& v1, const css_selector::ptr& v2)
{
return (*v1 > *v2);
}
inline bool operator < (const css_selector::ptr& v1, const css_selector::ptr& v2)
{
return (*v1 < *v2);
}
//////////////////////////////////////////////////////////////////////////
class used_selector
{
public:
typedef std::unique_ptr<used_selector> ptr;
typedef std::vector<used_selector::ptr> vector;
css_selector::ptr m_selector;
bool m_used;
used_selector(const css_selector::ptr& selector, bool used)
{
m_used = used;
m_selector = selector;
}
};
}
#endif // LH_CSS_SELECTOR_H
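
The nested `if` chain in `selector_specificity::operator>` above is a lexicographic comparison of the four components `a`, `b`, `c`, `d`, and the `css_selector` comparison operators fall back to declaration order when the specificities are equal. A compact way to express the same ordering is `std::tie`. This is a sketch with hypothetical `demo_` types, not a proposed change to the vendored header.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-in for litehtml::selector_specificity.
struct demo_specificity
{
    int a;
    int b;
    int c;
    int d;

    demo_specificity(int va, int vb, int vc, int vd) : a(va), b(vb), c(vc), d(vd) {}
};

// Lexicographic comparison: a decides first, then b, then c, then d.
inline bool operator<(const demo_specificity& l, const demo_specificity& r)
{
    return std::tie(l.a, l.b, l.c, l.d) < std::tie(r.a, r.b, r.c, r.d);
}

inline bool operator==(const demo_specificity& l, const demo_specificity& r)
{
    return std::tie(l.a, l.b, l.c, l.d) == std::tie(r.a, r.b, r.c, r.d);
}

// Hypothetical stand-in for the ordering-relevant part of litehtml::css_selector.
struct demo_rule
{
    demo_specificity specificity;
    int order;

    demo_rule(const demo_specificity& s, int o) : specificity(s), order(o) {}
};

// Equal specificity falls back to source order, so a later rule sorts higher.
inline bool operator<(const demo_rule& l, const demo_rule& r)
{
    if (l.specificity == r.specificity)
    {
        return l.order < r.order;
    }
    return l.specificity < r.specificity;
}
```

With this ordering, five class selectors still rank below a single ID selector, because the `b` component is compared before `c`.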

View File

@ -0,0 +1,117 @@
#ifndef LH_DOCUMENT_H
#define LH_DOCUMENT_H
#include "style.h"
#include "types.h"
#include "context.h"
namespace litehtml
{
struct css_text
{
typedef std::vector<css_text> vector;
tstring text;
tstring baseurl;
tstring media;
css_text() = default;
css_text(const tchar_t* txt, const tchar_t* url, const tchar_t* media_str)
{
text = txt ? txt : _t("");
baseurl = url ? url : _t("");
media = media_str ? media_str : _t("");
}
css_text(const css_text& val)
{
text = val.text;
baseurl = val.baseurl;
media = val.media;
}
};
class html_tag;
class document : public std::enable_shared_from_this<document>
{
public:
typedef std::shared_ptr<document> ptr;
typedef std::weak_ptr<document> weak_ptr;
private:
std::shared_ptr<element> m_root;
document_container* m_container;
fonts_map m_fonts;
css_text::vector m_css;
litehtml::css m_styles;
litehtml::web_color m_def_color;
litehtml::context* m_context;
litehtml::size m_size;
position::vector m_fixed_boxes;
media_query_list::vector m_media_lists;
element::ptr m_over_element;
elements_vector m_tabular_elements;
media_features m_media;
tstring m_lang;
tstring m_culture;
public:
document(litehtml::document_container* objContainer, litehtml::context* ctx);
virtual ~document();
litehtml::document_container* container() { return m_container; }
uint_ptr get_font(const tchar_t* name, int size, const tchar_t* weight, const tchar_t* style, const tchar_t* decoration, font_metrics* fm);
int render(int max_width, render_type rt = render_all);
void draw(uint_ptr hdc, int x, int y, const position* clip);
web_color get_def_color() { return m_def_color; }
int cvt_units(const tchar_t* str, int fontSize, bool* is_percent = nullptr) const;
int cvt_units(css_length& val, int fontSize, int size = 0) const;
int width() const;
int height() const;
void add_stylesheet(const tchar_t* str, const tchar_t* baseurl, const tchar_t* media);
bool on_mouse_over(int x, int y, int client_x, int client_y, position::vector& redraw_boxes);
bool on_lbutton_down(int x, int y, int client_x, int client_y, position::vector& redraw_boxes);
bool on_lbutton_up(int x, int y, int client_x, int client_y, position::vector& redraw_boxes);
bool on_mouse_leave(position::vector& redraw_boxes);
litehtml::element::ptr create_element(const tchar_t* tag_name, const string_map& attributes);
element::ptr root();
void get_fixed_boxes(position::vector& fixed_boxes);
void add_fixed_box(const position& pos);
void add_media_list(const media_query_list::ptr& list);
bool media_changed();
bool lang_changed();
bool match_lang(const tstring & lang);
void add_tabular(const element::ptr& el);
element::const_ptr get_over_element() const { return m_over_element; }
void append_children_from_string(element& parent, const tchar_t* str);
void append_children_from_utf8(element& parent, const char* str);
static litehtml::document::ptr createFromString(const tchar_t* str, litehtml::document_container* objPainter, litehtml::context* ctx, litehtml::css* user_styles = nullptr);
static litehtml::document::ptr createFromUTF8(const char* str, litehtml::document_container* objPainter, litehtml::context* ctx, litehtml::css* user_styles = nullptr);
private:
litehtml::uint_ptr add_font(const tchar_t* name, int size, const tchar_t* weight, const tchar_t* style, const tchar_t* decoration, font_metrics* fm);
void create_node(void* gnode, elements_vector& elements, bool parseTextNode);
bool update_media_lists(const media_features& features);
void fix_tables_layout();
void fix_table_children(element::ptr& el_ptr, style_display disp, const tchar_t* disp_str);
void fix_table_parent(element::ptr& el_ptr, style_display disp, const tchar_t* disp_str);
};
inline element::ptr document::root()
{
return m_root;
}
inline void document::add_tabular(const element::ptr& el)
{
m_tabular_elements.push_back(el);
}
inline bool document::match_lang(const tstring & lang)
{
return lang == m_lang || lang == m_culture;
}
}
#endif // LH_DOCUMENT_H

View File

@ -0,0 +1,18 @@
#ifndef LH_EL_ANCHOR_H
#define LH_EL_ANCHOR_H
#include "html_tag.h"
namespace litehtml
{
class el_anchor : public html_tag
{
public:
explicit el_anchor(const std::shared_ptr<litehtml::document>& doc);
void on_click() override;
void apply_stylesheet(const litehtml::css& stylesheet) override;
};
}
#endif // LH_EL_ANCHOR_H

View File

@ -0,0 +1,17 @@
#ifndef LH_EL_BASE_H
#define LH_EL_BASE_H
#include "html_tag.h"
namespace litehtml
{
class el_base : public html_tag
{
public:
explicit el_base(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
};
}
#endif // LH_EL_BASE_H

View File

@ -0,0 +1,40 @@
#ifndef LH_EL_BEFORE_AFTER_H
#define LH_EL_BEFORE_AFTER_H
#include "html_tag.h"
namespace litehtml
{
class el_before_after_base : public html_tag
{
public:
el_before_after_base(const std::shared_ptr<litehtml::document>& doc, bool before);
void add_style(const tstring& style, const tstring& baseurl) override;
void apply_stylesheet(const litehtml::css& stylesheet) override;
private:
void add_text(const tstring& txt);
void add_function(const tstring& fnc, const tstring& params);
static tstring convert_escape(const tchar_t* txt);
};
class el_before : public el_before_after_base
{
public:
explicit el_before(const std::shared_ptr<litehtml::document>& doc) : el_before_after_base(doc, true)
{
}
};
class el_after : public el_before_after_base
{
public:
explicit el_after(const std::shared_ptr<litehtml::document>& doc) : el_before_after_base(doc, false)
{
}
};
}
#endif // LH_EL_BEFORE_AFTER_H

@ -0,0 +1,17 @@
#ifndef LH_EL_BODY_H
#define LH_EL_BODY_H
#include "html_tag.h"
namespace litehtml
{
class el_body : public html_tag
{
public:
explicit el_body(const std::shared_ptr<litehtml::document>& doc);
bool is_body() const override;
};
}
#endif // LH_EL_BODY_H

@ -0,0 +1,17 @@
#ifndef LH_EL_BREAK_H
#define LH_EL_BREAK_H
#include "html_tag.h"
namespace litehtml
{
class el_break : public html_tag
{
public:
explicit el_break(const std::shared_ptr<litehtml::document>& doc);
bool is_break() const override;
};
}
#endif // LH_EL_BREAK_H
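
Review note: the small `el_*` headers in this commit share one pattern: a tag-specific class derives from a common base and overrides a single virtual predicate or hook. A minimal sketch of that pattern follows. `Element`, `BreakElement`, and `count_breaks` are hypothetical stand-ins rather than litehtml types, and the assumption that `el_break::is_break()` returns `true` is inferred from the name, since the implementation is not part of this header.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for litehtml::element: predicates default to false.
struct Element {
    virtual ~Element() = default;
    virtual bool is_break() const { return false; }
};

// Hypothetical stand-in for litehtml::el_break (assumed to report true).
struct BreakElement : Element {
    bool is_break() const override { return true; }
};

// Layout code can query the predicate without knowing the concrete type.
inline int count_breaks(const std::vector<std::shared_ptr<Element>>& children) {
    int n = 0;
    for (const auto& child : children) {
        if (child->is_break()) {
            ++n;
        }
    }
    return n;
}
```

Keeping the predicate on the base class lets callers avoid `dynamic_cast` when walking the element tree.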

@ -0,0 +1,19 @@
#ifndef LH_EL_CDATA_H
#define LH_EL_CDATA_H
#include "html_tag.h"
namespace litehtml
{
class el_cdata : public element
{
tstring m_text;
public:
explicit el_cdata(const std::shared_ptr<litehtml::document>& doc);
void get_text(tstring& text) override;
void set_data(const tchar_t* data) override;
};
}
#endif // LH_EL_CDATA_H

@ -0,0 +1,20 @@
#ifndef LH_EL_COMMENT_H
#define LH_EL_COMMENT_H
#include "html_tag.h"
namespace litehtml
{
class el_comment : public element
{
tstring m_text;
public:
explicit el_comment(const std::shared_ptr<litehtml::document>& doc);
bool is_comment() const override;
void get_text(tstring& text) override;
void set_data(const tchar_t* data) override;
};
}
#endif // LH_EL_COMMENT_H

@ -0,0 +1,17 @@
#ifndef LH_EL_DIV_H
#define LH_EL_DIV_H
#include "html_tag.h"
namespace litehtml
{
class el_div : public html_tag
{
public:
explicit el_div(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
};
}
#endif // LH_EL_DIV_H

@ -0,0 +1,17 @@
#ifndef LH_EL_FONT_H
#define LH_EL_FONT_H
#include "html_tag.h"
namespace litehtml
{
class el_font : public html_tag
{
public:
explicit el_font(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
};
}
#endif // LH_EL_FONT_H

@ -0,0 +1,28 @@
#ifndef LH_EL_IMAGE_H
#define LH_EL_IMAGE_H
#include "html_tag.h"
namespace litehtml
{
class el_image : public html_tag
{
tstring m_src;
public:
el_image(const std::shared_ptr<litehtml::document>& doc);
virtual ~el_image(void);
virtual int line_height() const override;
virtual bool is_replaced() const override;
virtual int render(int x, int y, int max_width, bool second_pass = false) override;
virtual void parse_attributes() override;
virtual void parse_styles(bool is_reparse = false) override;
virtual void draw(uint_ptr hdc, int x, int y, const position* clip) override;
virtual void get_content_size(size& sz, int max_width) override;
private:
int calc_max_height(int image_height);
};
}
#endif // LH_EL_IMAGE_H

@ -0,0 +1,20 @@
#ifndef LH_EL_LI_H
#define LH_EL_LI_H
#include "html_tag.h"
namespace litehtml
{
class el_li : public html_tag
{
public:
explicit el_li(const std::shared_ptr<litehtml::document>& doc);
int render(int x, int y, int max_width, bool second_pass = false) override;
private:
bool m_index_initialized = false;
};
}
#endif // LH_EL_LI_H

@ -0,0 +1,18 @@
#ifndef LH_EL_LINK_H
#define LH_EL_LINK_H
#include "html_tag.h"
namespace litehtml
{
class el_link : public html_tag
{
public:
explicit el_link(const std::shared_ptr<litehtml::document>& doc);
protected:
void parse_attributes() override;
};
}
#endif // LH_EL_LINK_H

@ -0,0 +1,18 @@
#ifndef LH_EL_PARA_H
#define LH_EL_PARA_H
#include "html_tag.h"
namespace litehtml
{
class el_para : public html_tag
{
public:
explicit el_para(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
};
}
#endif // LH_EL_PARA_H

@ -0,0 +1,20 @@
#ifndef LH_EL_SCRIPT_H
#define LH_EL_SCRIPT_H
#include "html_tag.h"
namespace litehtml
{
class el_script : public element
{
tstring m_text;
public:
explicit el_script(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
bool appendChild(const ptr &el) override;
const tchar_t* get_tagName() const override;
};
}
#endif // LH_EL_SCRIPT_H

@ -0,0 +1,20 @@
#ifndef LH_EL_SPACE_H
#define LH_EL_SPACE_H
#include "html_tag.h"
#include "el_text.h"
namespace litehtml
{
class el_space : public el_text
{
public:
el_space(const tchar_t* text, const std::shared_ptr<litehtml::document>& doc);
bool is_white_space() const override;
bool is_break() const override;
bool is_space() const override;
};
}
#endif // LH_EL_SPACE_H

@ -0,0 +1,20 @@
#ifndef LH_EL_STYLE_H
#define LH_EL_STYLE_H
#include "html_tag.h"
namespace litehtml
{
class el_style : public element
{
elements_vector m_children;
public:
explicit el_style(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
bool appendChild(const ptr &el) override;
const tchar_t* get_tagName() const override;
};
}
#endif // LH_EL_STYLE_H

@ -0,0 +1,26 @@
#ifndef LH_EL_TABLE_H
#define LH_EL_TABLE_H
#include "html_tag.h"
namespace litehtml
{
struct col_info
{
int width;
bool is_auto;
};
class el_table : public html_tag
{
public:
explicit el_table(const std::shared_ptr<litehtml::document>& doc);
bool appendChild(const litehtml::element::ptr& el) override;
void parse_styles(bool is_reparse = false) override;
void parse_attributes() override;
};
}
#endif // LH_EL_TABLE_H

@ -0,0 +1,17 @@
#ifndef LH_EL_TD_H
#define LH_EL_TD_H
#include "html_tag.h"
namespace litehtml
{
class el_td : public html_tag
{
public:
explicit el_td(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
};
}
#endif // LH_EL_TD_H

@ -0,0 +1,37 @@
#ifndef LH_EL_TEXT_H
#define LH_EL_TEXT_H
#include "html_tag.h"
namespace litehtml
{
class el_text : public element
{
protected:
tstring m_text;
tstring m_transformed_text;
size m_size;
text_transform m_text_transform;
bool m_use_transformed;
bool m_draw_spaces;
public:
el_text(const tchar_t* text, const std::shared_ptr<litehtml::document>& doc);
void get_text(tstring& text) override;
const tchar_t* get_style_property(const tchar_t* name, bool inherited, const tchar_t* def = nullptr) const override;
void parse_styles(bool is_reparse) override;
int get_base_line() override;
void draw(uint_ptr hdc, int x, int y, const position* clip) override;
int line_height() const override;
uint_ptr get_font(font_metrics* fm = nullptr) override;
style_display get_display() const override;
white_space get_white_space() const override;
element_position get_element_position(css_offsets* offsets = nullptr) const override;
css_offsets get_css_offsets() const override;
protected:
void get_content_size(size& sz, int max_width) override;
};
}
#endif // LH_EL_TEXT_H

@ -0,0 +1,18 @@
#ifndef LH_EL_TITLE_H
#define LH_EL_TITLE_H
#include "html_tag.h"
namespace litehtml
{
class el_title : public html_tag
{
public:
explicit el_title(const std::shared_ptr<litehtml::document>& doc);
protected:
void parse_attributes() override;
};
}
#endif // LH_EL_TITLE_H

@ -0,0 +1,18 @@
#ifndef LH_EL_TR_H
#define LH_EL_TR_H
#include "html_tag.h"
namespace litehtml
{
class el_tr : public html_tag
{
public:
explicit el_tr(const std::shared_ptr<litehtml::document>& doc);
void parse_attributes() override;
void get_inline_boxes(position::vector& boxes) override;
};
}
#endif // LH_EL_TR_H

@ -0,0 +1,408 @@
#ifndef LH_ELEMENT_H
#define LH_ELEMENT_H
#include <memory>
#include "stylesheet.h"
#include "css_offsets.h"
namespace litehtml
{
class box;
class element : public std::enable_shared_from_this<element>
{
friend class block_box;
friend class line_box;
friend class html_tag;
friend class el_table;
friend class document;
public:
typedef std::shared_ptr<litehtml::element> ptr;
typedef std::shared_ptr<const litehtml::element> const_ptr;
typedef std::weak_ptr<litehtml::element> weak_ptr;
protected:
std::weak_ptr<element> m_parent;
std::weak_ptr<litehtml::document> m_doc;
litehtml::box* m_box;
elements_vector m_children;
position m_pos;
margins m_margins;
margins m_padding;
margins m_borders;
bool m_skip;
virtual void select_all(const css_selector& selector, elements_vector& res);
public:
explicit element(const std::shared_ptr<litehtml::document>& doc);
virtual ~element() = default;
// returns a reference to the m_pos member
position& get_position();
int left() const;
int right() const;
int top() const;
int bottom() const;
int height() const;
int width() const;
int content_margins_top() const;
int content_margins_bottom() const;
int content_margins_left() const;
int content_margins_right() const;
int content_margins_width() const;
int content_margins_height() const;
int margin_top() const;
int margin_bottom() const;
int margin_left() const;
int margin_right() const;
margins get_margins() const;
int padding_top() const;
int padding_bottom() const;
int padding_left() const;
int padding_right() const;
margins get_paddings() const;
int border_top() const;
int border_bottom() const;
int border_left() const;
int border_right() const;
margins get_borders() const;
bool in_normal_flow() const;
litehtml::web_color get_color(const tchar_t* prop_name, bool inherited, const litehtml::web_color& def_color = litehtml::web_color());
bool is_inline_box() const;
position get_placement() const;
bool collapse_top_margin() const;
bool collapse_bottom_margin() const;
bool is_positioned() const;
bool skip() const;
void skip(bool val);
bool have_parent() const;
element::ptr parent() const;
void parent(const element::ptr& par);
bool is_visible() const;
int calc_width(int defVal) const;
int get_inline_shift_left();
int get_inline_shift_right();
void apply_relative_shift(int parent_width);
// returns true for elements inside a table (but outside cells) that don't participate in table rendering
bool is_table_skip() const;
std::shared_ptr<document> get_document() const;
virtual elements_vector select_all(const tstring& selector);
virtual elements_vector select_all(const css_selector& selector);
virtual element::ptr select_one(const tstring& selector);
virtual element::ptr select_one(const css_selector& selector);
virtual int render(int x, int y, int max_width, bool second_pass = false);
virtual int render_inline(const ptr &container, int max_width);
virtual int place_element(const ptr &el, int max_width);
virtual void calc_outlines( int parent_width );
virtual void calc_auto_margins(int parent_width);
virtual void apply_vertical_align();
virtual bool fetch_positioned();
virtual void render_positioned(render_type rt = render_all);
virtual bool appendChild(const ptr &el);
virtual bool removeChild(const ptr &el);
virtual void clearRecursive();
virtual const tchar_t* get_tagName() const;
virtual void set_tagName(const tchar_t* tag);
virtual void set_data(const tchar_t* data);
virtual element_float get_float() const;
virtual vertical_align get_vertical_align() const;
virtual element_clear get_clear() const;
virtual size_t get_children_count() const;
virtual element::ptr get_child(int idx) const;
virtual overflow get_overflow() const;
virtual css_length get_css_left() const;
virtual css_length get_css_right() const;
virtual css_length get_css_top() const;
virtual css_length get_css_bottom() const;
virtual css_offsets get_css_offsets() const;
virtual css_length get_css_width() const;
virtual void set_css_width(css_length& w);
virtual css_length get_css_height() const;
virtual void set_attr(const tchar_t* name, const tchar_t* val);
virtual const tchar_t* get_attr(const tchar_t* name, const tchar_t* def = nullptr) const;
virtual void apply_stylesheet(const litehtml::css& stylesheet);
virtual void refresh_styles();
virtual bool is_white_space() const;
virtual bool is_space() const;
virtual bool is_comment() const;
virtual bool is_body() const;
virtual bool is_break() const;
virtual int get_base_line();
virtual bool on_mouse_over();
virtual bool on_mouse_leave();
virtual bool on_lbutton_down();
virtual bool on_lbutton_up();
virtual void on_click();
virtual bool find_styles_changes(position::vector& redraw_boxes, int x, int y);
virtual const tchar_t* get_cursor();
virtual void init_font();
virtual bool is_point_inside(int x, int y);
virtual bool set_pseudo_class(const tchar_t* pclass, bool add);
virtual bool set_class(const tchar_t* pclass, bool add);
virtual bool is_replaced() const;
virtual int line_height() const;
virtual white_space get_white_space() const;
virtual style_display get_display() const;
virtual visibility get_visibility() const;
virtual element_position get_element_position(css_offsets* offsets = nullptr) const;
virtual void get_inline_boxes(position::vector& boxes);
virtual void parse_styles(bool is_reparse = false);
virtual void draw(uint_ptr hdc, int x, int y, const position* clip);
virtual void draw_background( uint_ptr hdc, int x, int y, const position* clip );
virtual const tchar_t* get_style_property(const tchar_t* name, bool inherited, const tchar_t* def = nullptr) const;
virtual uint_ptr get_font(font_metrics* fm = nullptr);
virtual int get_font_size() const;
virtual void get_text(tstring& text);
virtual void parse_attributes();
virtual int select(const css_selector& selector, bool apply_pseudo = true);
virtual int select(const css_element_selector& selector, bool apply_pseudo = true);
virtual element::ptr find_ancestor(const css_selector& selector, bool apply_pseudo = true, bool* is_pseudo = nullptr);
virtual bool is_ancestor(const ptr &el) const;
virtual element::ptr find_adjacent_sibling(const element::ptr& el, const css_selector& selector, bool apply_pseudo = true, bool* is_pseudo = nullptr);
virtual element::ptr find_sibling(const element::ptr& el, const css_selector& selector, bool apply_pseudo = true, bool* is_pseudo = nullptr);
virtual bool is_first_child_inline(const element::ptr& el) const;
virtual bool is_last_child_inline(const element::ptr& el);
virtual bool have_inline_child() const;
virtual void get_content_size(size& sz, int max_width);
virtual void init();
virtual bool is_floats_holder() const;
virtual int get_floats_height(element_float el_float = float_none) const;
virtual int get_left_floats_height() const;
virtual int get_right_floats_height() const;
virtual int get_line_left(int y);
virtual int get_line_right(int y, int def_right);
virtual void get_line_left_right(int y, int def_right, int& ln_left, int& ln_right);
virtual void add_float(const ptr &el, int x, int y);
virtual void update_floats(int dy, const ptr &parent);
virtual void add_positioned(const ptr &el);
virtual int find_next_line_top(int top, int width, int def_right);
virtual int get_zindex() const;
virtual void draw_stacking_context(uint_ptr hdc, int x, int y, const position* clip, bool with_positioned);
virtual void draw_children( uint_ptr hdc, int x, int y, const position* clip, draw_flag flag, int zindex );
virtual bool is_nth_child(const element::ptr& el, int num, int off, bool of_type) const;
virtual bool is_nth_last_child(const element::ptr& el, int num, int off, bool of_type) const;
virtual bool is_only_child(const element::ptr& el, bool of_type) const;
virtual bool get_predefined_height(int& p_height) const;
virtual void calc_document_size(litehtml::size& sz, int x = 0, int y = 0);
virtual void get_redraw_box(litehtml::position& pos, int x = 0, int y = 0);
virtual void add_style(const tstring& style, const tstring& baseurl);
virtual element::ptr get_element_by_point(int x, int y, int client_x, int client_y);
virtual element::ptr get_child_by_point(int x, int y, int client_x, int client_y, draw_flag flag, int zindex);
virtual const background* get_background(bool own_only = false);
};
//////////////////////////////////////////////////////////////////////////
// INLINE FUNCTIONS //
//////////////////////////////////////////////////////////////////////////
inline int litehtml::element::right() const
{
return left() + width();
}
inline int litehtml::element::left() const
{
return m_pos.left() - margin_left() - m_padding.left - m_borders.left;
}
inline int litehtml::element::top() const
{
return m_pos.top() - margin_top() - m_padding.top - m_borders.top;
}
inline int litehtml::element::bottom() const
{
return top() + height();
}
inline int litehtml::element::height() const
{
return m_pos.height + margin_top() + margin_bottom() + m_padding.height() + m_borders.height();
}
inline int litehtml::element::width() const
{
return m_pos.width + margin_left() + margin_right() + m_padding.width() + m_borders.width();
}
inline int litehtml::element::content_margins_top() const
{
return margin_top() + m_padding.top + m_borders.top;
}
inline int litehtml::element::content_margins_bottom() const
{
return margin_bottom() + m_padding.bottom + m_borders.bottom;
}
inline int litehtml::element::content_margins_left() const
{
return margin_left() + m_padding.left + m_borders.left;
}
inline int litehtml::element::content_margins_right() const
{
return margin_right() + m_padding.right + m_borders.right;
}
inline int litehtml::element::content_margins_width() const
{
return content_margins_left() + content_margins_right();
}
inline int litehtml::element::content_margins_height() const
{
return content_margins_top() + content_margins_bottom();
}
inline litehtml::margins litehtml::element::get_paddings() const
{
return m_padding;
}
inline litehtml::margins litehtml::element::get_borders() const
{
return m_borders;
}
inline int litehtml::element::padding_top() const
{
return m_padding.top;
}
inline int litehtml::element::padding_bottom() const
{
return m_padding.bottom;
}
inline int litehtml::element::padding_left() const
{
return m_padding.left;
}
inline int litehtml::element::padding_right() const
{
return m_padding.right;
}
inline bool litehtml::element::in_normal_flow() const
{
if(get_element_position() != element_position_absolute && get_display() != display_none)
{
return true;
}
return false;
}
inline int litehtml::element::border_top() const
{
return m_borders.top;
}
inline int litehtml::element::border_bottom() const
{
return m_borders.bottom;
}
inline int litehtml::element::border_left() const
{
return m_borders.left;
}
inline int litehtml::element::border_right() const
{
return m_borders.right;
}
inline bool litehtml::element::skip() const
{
return m_skip;
}
inline void litehtml::element::skip(bool val)
{
m_skip = val;
}
inline bool litehtml::element::have_parent() const
{
return !m_parent.expired();
}
inline element::ptr litehtml::element::parent() const
{
return m_parent.lock();
}
inline void litehtml::element::parent(const element::ptr& par)
{
m_parent = par;
}
inline int litehtml::element::margin_top() const
{
return m_margins.top;
}
inline int litehtml::element::margin_bottom() const
{
return m_margins.bottom;
}
inline int litehtml::element::margin_left() const
{
return m_margins.left;
}
inline int litehtml::element::margin_right() const
{
return m_margins.right;
}
inline litehtml::margins litehtml::element::get_margins() const
{
margins ret;
ret.left = margin_left();
ret.right = margin_right();
ret.top = margin_top();
ret.bottom = margin_bottom();
return ret;
}
inline bool litehtml::element::is_positioned() const
{
return (get_element_position() > element_position_static);
}
inline bool litehtml::element::is_visible() const
{
return !(m_skip || get_display() == display_none || get_visibility() != visibility_visible);
}
inline position& litehtml::element::get_position()
{
return m_pos;
}
inline std::shared_ptr<document> element::get_document() const
{
return m_doc.lock();
}
}
#endif // LH_ELEMENT_H
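
Review note: the inline accessors in `element.h` derive an element's outer box from its content rectangle (`m_pos`) by adding margins, padding, and borders. The sketch below reproduces that arithmetic for the horizontal axis. `Edges`, `ContentRect`, and `Box` are hypothetical, simplified stand-ins for `litehtml::margins`, `litehtml::position`, and `litehtml::element`, and `Edges::width()` returning `left + right` is an assumption about `margins::width()`, which is declared in a header not shown here.

```cpp
#include <cassert>

// Hypothetical stand-in for litehtml::margins.
struct Edges {
    int left = 0, right = 0, top = 0, bottom = 0;
    int width() const { return left + right; }   // assumed semantics
    int height() const { return top + bottom; }  // assumed semantics
};

// Hypothetical stand-in for litehtml::position (the content rectangle).
struct ContentRect {
    int x = 0, y = 0, width = 0, height = 0;
};

struct Box {
    ContentRect content;  // plays the role of m_pos
    Edges margin, padding, border;

    // Outer left edge: content origin minus everything to its left.
    int left() const { return content.x - margin.left - padding.left - border.left; }
    // Outer width: content width plus horizontal margin, padding, and border.
    int width() const { return content.width + margin.width() + padding.width() + border.width(); }
    // Outer right edge, derived the same way as element::right().
    int right() const { return left() + width(); }
};
```

For a content rectangle at x = 100 with width 200, margins of 10, padding of 5, and borders of 1 on each side, `left()` is 84, `width()` is 232, and `right()` is 316.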

@ -0,0 +1,117 @@
#ifndef LH_HTML_H
#define LH_HTML_H
#include <stdlib.h>
#include <string>
#include <ctype.h>
#include <vector>
#include <map>
#include <cstring>
#include <algorithm>
#include <sstream>
#include <functional>
#include "os_types.h"
#include "types.h"
#include "background.h"
#include "borders.h"
#include "html_tag.h"
#include "web_color.h"
#include "media_query.h"
namespace litehtml
{
struct list_marker
{
tstring image;
const tchar_t* baseurl;
list_style_type marker_type;
web_color color;
position pos;
int index;
uint_ptr font;
};
// callback interface to draw text, images and other elements
class document_container
{
public:
virtual litehtml::uint_ptr create_font(const litehtml::tchar_t* faceName, int size, int weight, litehtml::font_style italic, unsigned int decoration, litehtml::font_metrics* fm) = 0;
virtual void delete_font(litehtml::uint_ptr hFont) = 0;
virtual int text_width(const litehtml::tchar_t* text, litehtml::uint_ptr hFont) = 0;
virtual void draw_text(litehtml::uint_ptr hdc, const litehtml::tchar_t* text, litehtml::uint_ptr hFont, litehtml::web_color color, const litehtml::position& pos) = 0;
virtual int pt_to_px(int pt) const = 0;
virtual int get_default_font_size() const = 0;
virtual const litehtml::tchar_t* get_default_font_name() const = 0;
virtual void draw_list_marker(litehtml::uint_ptr hdc, const litehtml::list_marker& marker) = 0;
virtual void load_image(const litehtml::tchar_t* src, const litehtml::tchar_t* baseurl, bool redraw_on_ready) = 0;
virtual void get_image_size(const litehtml::tchar_t* src, const litehtml::tchar_t* baseurl, litehtml::size& sz) = 0;
virtual void draw_background(litehtml::uint_ptr hdc, const litehtml::background_paint& bg) = 0;
virtual void draw_borders(litehtml::uint_ptr hdc, const litehtml::borders& borders, const litehtml::position& draw_pos, bool root) = 0;
virtual void set_caption(const litehtml::tchar_t* caption) = 0;
virtual void set_base_url(const litehtml::tchar_t* base_url) = 0;
virtual void link(const std::shared_ptr<litehtml::document>& doc, const litehtml::element::ptr& el) = 0;
virtual void on_anchor_click(const litehtml::tchar_t* url, const litehtml::element::ptr& el) = 0;
virtual void set_cursor(const litehtml::tchar_t* cursor) = 0;
virtual void transform_text(litehtml::tstring& text, litehtml::text_transform tt) = 0;
virtual void import_css(litehtml::tstring& text, const litehtml::tstring& url, litehtml::tstring& baseurl) = 0;
virtual void set_clip(const litehtml::position& pos, const litehtml::border_radiuses& bdr_radius, bool valid_x, bool valid_y) = 0;
virtual void del_clip() = 0;
virtual void get_client_rect(litehtml::position& client) const = 0;
virtual std::shared_ptr<litehtml::element> create_element(const litehtml::tchar_t *tag_name,
const litehtml::string_map &attributes,
const std::shared_ptr<litehtml::document> &doc) = 0;
virtual void get_media_features(litehtml::media_features& media) const = 0;
virtual void get_language(litehtml::tstring& language, litehtml::tstring & culture) const = 0;
virtual litehtml::tstring resolve_color(const litehtml::tstring& /*color*/) const { return litehtml::tstring(); }
virtual void split_text(const char* text, const std::function<void(const tchar_t*)>& on_word, const std::function<void(const tchar_t*)>& on_space);
protected:
~document_container() = default;
};
void trim(tstring &s);
void lcase(tstring &s);
int value_index(const tstring& val, const tstring& strings, int defValue = -1, tchar_t delim = _t(';'));
bool value_in_list(const tstring& val, const tstring& strings, tchar_t delim = _t(';'));
tstring::size_type find_close_bracket(const tstring &s, tstring::size_type off, tchar_t open_b = _t('('), tchar_t close_b = _t(')'));
void split_string(const tstring& str, string_vector& tokens, const tstring& delims, const tstring& delims_preserve = _t(""), const tstring& quote = _t("\""));
void join_string(tstring& str, const string_vector& tokens, const tstring& delims);
double t_strtod(const tchar_t* string, tchar_t** endPtr);
int t_strcasecmp(const tchar_t *s1, const tchar_t *s2);
int t_strncasecmp(const tchar_t *s1, const tchar_t *s2, size_t n);
inline int t_isdigit(int c)
{
return (c >= '0' && c <= '9');
}
inline int t_tolower(int c)
{
return (c >= 'A' && c <= 'Z' ? c + 'a' - 'A' : c);
}
inline int round_f(float val)
{
int int_val = (int) val;
if(val - int_val >= 0.5)
{
int_val++;
}
return int_val;
}
inline int round_d(double val)
{
int int_val = (int) val;
if(val - int_val >= 0.5)
{
int_val++;
}
return int_val;
}
}
#endif // LH_HTML_H
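
Review note: `round_f` and `round_d` above truncate toward zero and then bump the result only when the positive fractional part reaches 0.5, so negative inputs are truncated rather than rounded to the nearest integer. The sketch below copies `round_f` so it compiles on its own and contrasts it with `round_half_away`, a hypothetical alternative that is not part of litehtml. Whether litehtml ever passes negative values to these helpers is not visible in this diff.

```cpp
#include <cassert>

// Copied from the header above so the sketch compiles on its own.
inline int round_f(float val)
{
    int int_val = (int) val;
    if (val - int_val >= 0.5)
    {
        int_val++;
    }
    return int_val;
}

// Hypothetical alternative (not part of litehtml): rounds halves away from
// zero for negative input as well.
inline int round_half_away(float val)
{
    return (int) (val >= 0 ? val + 0.5f : val - 0.5f);
}
```

With these definitions, `round_f(-2.6f)` yields -2, while `round_half_away(-2.6f)` yields -3. Both return 3 for `2.5f`.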

@ -0,0 +1,248 @@
#ifndef LH_HTML_TAG_H
#define LH_HTML_TAG_H
#include "element.h"
#include "style.h"
#include "background.h"
#include "css_margins.h"
#include "borders.h"
#include "css_selector.h"
#include "stylesheet.h"
#include "box.h"
#include "table.h"
namespace litehtml
{
struct line_context
{
int calculatedTop;
int top;
int left;
int right;
int width() const
{
return right - left;
}
void fix_top()
{
calculatedTop = top;
}
};
class html_tag : public element
{
friend class elements_iterator;
friend class el_table;
friend class table_grid;
friend class block_box;
friend class line_box;
public:
typedef std::shared_ptr<litehtml::html_tag> ptr;
protected:
box::vector m_boxes;
string_vector m_class_values;
tstring m_tag;
litehtml::style m_style;
string_map m_attrs;
vertical_align m_vertical_align;
text_align m_text_align;
style_display m_display;
list_style_type m_list_style_type;
list_style_position m_list_style_position;
white_space m_white_space;
element_float m_float;
element_clear m_clear;
floated_box::vector m_floats_left;
floated_box::vector m_floats_right;
elements_vector m_positioned;
background m_bg;
element_position m_el_position;
int m_line_height;
bool m_lh_predefined;
string_vector m_pseudo_classes;
used_selector::vector m_used_styles;
uint_ptr m_font;
int m_font_size;
font_metrics m_font_metrics;
css_margins m_css_margins;
css_margins m_css_padding;
css_borders m_css_borders;
css_length m_css_width;
css_length m_css_height;
css_length m_css_min_width;
css_length m_css_min_height;
css_length m_css_max_width;
css_length m_css_max_height;
css_offsets m_css_offsets;
css_length m_css_text_indent;
overflow m_overflow;
visibility m_visibility;
int m_z_index;
box_sizing m_box_sizing;
int_int_cache m_cahe_line_left;
int_int_cache m_cahe_line_right;
// data for table rendering
std::unique_ptr<table_grid> m_grid;
css_length m_css_border_spacing_x;
css_length m_css_border_spacing_y;
int m_border_spacing_x;
int m_border_spacing_y;
border_collapse m_border_collapse;
void select_all(const css_selector& selector, elements_vector& res) override;
public:
explicit html_tag(const std::shared_ptr<litehtml::document>& doc);
/* render functions */
int render(int x, int y, int max_width, bool second_pass = false) override;
int render_inline(const element::ptr &container, int max_width) override;
int place_element(const element::ptr &el, int max_width) override;
bool fetch_positioned() override;
void render_positioned(render_type rt = render_all) override;
int new_box(const element::ptr &el, int max_width, line_context& line_ctx);
int get_cleared_top(const element::ptr &el, int line_top) const;
int finish_last_box(bool end_of_render = false);
bool appendChild(const element::ptr &el) override;
bool removeChild(const element::ptr &el) override;
void clearRecursive() override;
const tchar_t* get_tagName() const override;
void set_tagName(const tchar_t* tag) override;
void set_data(const tchar_t* data) override;
element_float get_float() const override;
vertical_align get_vertical_align() const override;
css_length get_css_left() const override;
css_length get_css_right() const override;
css_length get_css_top() const override;
css_length get_css_bottom() const override;
css_length get_css_width() const override;
css_offsets get_css_offsets() const override;
void set_css_width(css_length& w) override;
css_length get_css_height() const override;
element_clear get_clear() const override;
size_t get_children_count() const override;
element::ptr get_child(int idx) const override;
element_position get_element_position(css_offsets* offsets = nullptr) const override;
overflow get_overflow() const override;
void set_attr(const tchar_t* name, const tchar_t* val) override;
const tchar_t* get_attr(const tchar_t* name, const tchar_t* def = nullptr) const override;
void apply_stylesheet(const litehtml::css& stylesheet) override;
void refresh_styles() override;
bool is_white_space() const override;
bool is_body() const override;
bool is_break() const override;
int get_base_line() override;
bool on_mouse_over() override;
bool on_mouse_leave() override;
bool on_lbutton_down() override;
bool on_lbutton_up() override;
void on_click() override;
bool find_styles_changes(position::vector& redraw_boxes, int x, int y) override;
const tchar_t* get_cursor() override;
void init_font() override;
bool set_pseudo_class(const tchar_t* pclass, bool add) override;
bool set_class(const tchar_t* pclass, bool add) override;
bool is_replaced() const override;
int line_height() const override;
white_space get_white_space() const override;
style_display get_display() const override;
visibility get_visibility() const override;
void parse_styles(bool is_reparse = false) override;
void draw(uint_ptr hdc, int x, int y, const position* clip) override;
void draw_background(uint_ptr hdc, int x, int y, const position* clip) override;
const tchar_t* get_style_property(const tchar_t* name, bool inherited, const tchar_t* def = nullptr) const override;
uint_ptr get_font(font_metrics* fm = nullptr) override;
int get_font_size() const override;
elements_vector& children();
void calc_outlines(int parent_width) override;
void calc_auto_margins(int parent_width) override;
int select(const css_selector& selector, bool apply_pseudo = true) override;
int select(const css_element_selector& selector, bool apply_pseudo = true) override;
elements_vector select_all(const tstring& selector) override;
elements_vector select_all(const css_selector& selector) override;
element::ptr select_one(const tstring& selector) override;
element::ptr select_one(const css_selector& selector) override;
element::ptr find_ancestor(const css_selector& selector, bool apply_pseudo = true, bool* is_pseudo = nullptr) override;
element::ptr find_adjacent_sibling(const element::ptr& el, const css_selector& selector, bool apply_pseudo = true, bool* is_pseudo = nullptr) override;
element::ptr find_sibling(const element::ptr& el, const css_selector& selector, bool apply_pseudo = true, bool* is_pseudo = nullptr) override;
void get_text(tstring& text) override;
void parse_attributes() override;
bool is_first_child_inline(const element::ptr& el) const override;
bool is_last_child_inline(const element::ptr& el) override;
bool have_inline_child() const override;
void get_content_size(size& sz, int max_width) override;
void init() override;
void get_inline_boxes(position::vector& boxes) override;
bool is_floats_holder() const override;
int get_floats_height(element_float el_float = float_none) const override;
int get_left_floats_height() const override;
int get_right_floats_height() const override;
int get_line_left(int y) override;
int get_line_right(int y, int def_right) override;
void get_line_left_right(int y, int def_right, int& ln_left, int& ln_right) override;
void add_float(const element::ptr &el, int x, int y) override;
void update_floats(int dy, const element::ptr &parent) override;
void add_positioned(const element::ptr &el) override;
int find_next_line_top(int top, int width, int def_right) override;
void apply_vertical_align() override;
void draw_children(uint_ptr hdc, int x, int y, const position* clip, draw_flag flag, int zindex) override;
int get_zindex() const override;
void draw_stacking_context(uint_ptr hdc, int x, int y, const position* clip, bool with_positioned) override;
void calc_document_size(litehtml::size& sz, int x = 0, int y = 0) override;
void get_redraw_box(litehtml::position& pos, int x = 0, int y = 0) override;
void add_style(const tstring& style, const tstring& baseurl) override;
element::ptr get_element_by_point(int x, int y, int client_x, int client_y) override;
element::ptr get_child_by_point(int x, int y, int client_x, int client_y, draw_flag flag, int zindex) override;
bool is_nth_child(const element::ptr& el, int num, int off, bool of_type) const override;
bool is_nth_last_child(const element::ptr& el, int num, int off, bool of_type) const override;
bool is_only_child(const element::ptr& el, bool of_type) const override;
const background* get_background(bool own_only = false) override;
protected:
void draw_children_box(uint_ptr hdc, int x, int y, const position* clip, draw_flag flag, int zindex);
void draw_children_table(uint_ptr hdc, int x, int y, const position* clip, draw_flag flag, int zindex);
int render_box(int x, int y, int max_width, bool second_pass = false);
int render_table(int x, int y, int max_width, bool second_pass = false);
int fix_line_width(int max_width, element_float flt);
void parse_background();
void init_background_paint( position pos, background_paint &bg_paint, const background* bg );
void draw_list_marker( uint_ptr hdc, const position &pos );
tstring get_list_marker_text(int index);
static void parse_nth_child_params( const tstring& param, int &num, int &off );
void remove_before_after();
litehtml::element::ptr get_element_before();
litehtml::element::ptr get_element_after();
};
/************************************************************************/
/* Inline Functions */
/************************************************************************/
inline elements_vector& litehtml::html_tag::children()
{
return m_children;
}
}
#endif // LH_HTML_TAG_H

@ -0,0 +1,89 @@
#ifndef LH_ITERATORS_H
#define LH_ITERATORS_H
#include "types.h"
namespace litehtml
{
class element;
class iterator_selector
{
public:
virtual bool select(const element::ptr& el) = 0;
protected:
~iterator_selector() = default;
};
class elements_iterator
{
private:
struct stack_item
{
int idx;
element::ptr el;
stack_item() : idx(0)
{
}
stack_item(const stack_item& val)
{
idx = val.idx;
el = val.el;
}
stack_item(stack_item&& val)
{
idx = val.idx;
el = std::move(val.el);
}
};
std::vector<stack_item> m_stack;
element::ptr m_el;
int m_idx;
iterator_selector* m_go_inside;
iterator_selector* m_select;
public:
elements_iterator(const element::ptr& el, iterator_selector* go_inside, iterator_selector* select)
{
m_el = el;
m_idx = -1;
m_go_inside = go_inside;
m_select = select;
}
~elements_iterator() = default;
element::ptr next(bool ret_parent = true);
private:
void next_idx();
};
class go_inside_inline final : public iterator_selector
{
public:
bool select(const element::ptr& el) override;
};
class go_inside_table final : public iterator_selector
{
public:
bool select(const element::ptr& el) override;
};
class table_rows_selector final : public iterator_selector
{
public:
bool select(const element::ptr& el) override;
};
class table_cells_selector final : public iterator_selector
{
public:
bool select(const element::ptr& el) override;
};
}
#endif // LH_ITERATORS_H

@ -0,0 +1,77 @@
#ifndef LH_MEDIA_QUERY_H
#define LH_MEDIA_QUERY_H
namespace litehtml
{
struct media_query_expression
{
typedef std::vector<media_query_expression> vector;
media_feature feature;
int val;
int val2;
bool check_as_bool;
media_query_expression()
{
check_as_bool = false;
feature = media_feature_none;
val = 0;
val2 = 0;
}
bool check(const media_features& features) const;
};
class media_query
{
public:
typedef std::shared_ptr<media_query> ptr;
typedef std::vector<media_query::ptr> vector;
private:
media_query_expression::vector m_expressions;
bool m_not;
media_type m_media_type;
public:
media_query();
media_query(const media_query& val);
static media_query::ptr create_from_string(const tstring& str, const std::shared_ptr<document>& doc);
bool check(const media_features& features) const;
};
class media_query_list
{
public:
typedef std::shared_ptr<media_query_list> ptr;
typedef std::vector<media_query_list::ptr> vector;
private:
media_query::vector m_queries;
bool m_is_used;
public:
media_query_list();
media_query_list(const media_query_list& val);
static media_query_list::ptr create_from_string(const tstring& str, const std::shared_ptr<document>& doc);
bool is_used() const;
bool apply_media_features(const media_features& features); // returns true if m_is_used changed
};
inline media_query_list::media_query_list(const media_query_list& val)
{
m_is_used = val.m_is_used;
m_queries = val.m_queries;
}
inline media_query_list::media_query_list()
{
m_is_used = false;
}
inline bool media_query_list::is_used() const
{
return m_is_used;
}
}
#endif // LH_MEDIA_QUERY_H
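The header above only declares `media_query_expression::check` and `media_query::check`. As a rough illustration of how such a check could work, here is a minimal sketch that assumes the usual CSS media query semantics: every expression in a query must hold, and the `not` flag inverts the result. All names in the `sketch` namespace are hypothetical stand-ins, only the width features are modelled, and the real implementation may differ.

```cpp
#include <cassert>
#include <vector>

namespace sketch
{
    // Hypothetical, reduced counterpart of litehtml::media_feature.
    enum feature { feature_width, feature_min_width, feature_max_width };

    // Hypothetical, reduced counterpart of litehtml::media_features.
    struct features
    {
        int width;
    };

    struct expression
    {
        feature feat;
        int     val;

        // Compares the viewport width against the stored value.
        bool check(const features& f) const
        {
            switch (feat)
            {
            case feature_width:     return f.width == val;
            case feature_min_width: return f.width >= val;
            case feature_max_width: return f.width <= val;
            }
            return false;
        }
    };

    struct query
    {
        std::vector<expression> expressions;
        bool                    negate;

        // Assumed semantics: all expressions are AND-ed, then `negate` inverts.
        bool check(const features& f) const
        {
            bool res = true;
            for (const auto& e : expressions)
            {
                res = res && e.check(f);
            }
            return negate ? !res : res;
        }
    };

    // Usage example: "(min-width: 600) and (max-width: 1024)" on an 800px viewport.
    inline void usage_example()
    {
        query q;
        q.negate = false;
        q.expressions.push_back(expression{ feature_min_width, 600 });
        q.expressions.push_back(expression{ feature_max_width, 1024 });
        features f;
        f.width = 800;
        assert(q.check(f));
    }
}
```

Keeping the comparison inside the expression and the AND/NOT logic inside the query mirrors the split between the two declared `check` methods.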

@ -0,0 +1,19 @@
#ifndef NUM_CVT_H
#define NUM_CVT_H
#include <string>
#include "os_types.h"
namespace litehtml
{
namespace num_cvt
{
litehtml::tstring to_latin_lower(int val);
litehtml::tstring to_latin_upper(int val);
litehtml::tstring to_greek_lower(int val);
litehtml::tstring to_roman_lower(int value);
litehtml::tstring to_roman_upper(int value);
}
}
#endif // NUM_CVT_H
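The `num_cvt` functions are only declared in this header. They presumably produce the marker text for ordered lists such as `lower-latin` and `lower-roman`. The following is a minimal sketch of two of them, assuming bijective base-26 numbering for Latin letters (1 is "a", 27 is "aa") and standard subtractive notation for Roman numerals. It uses `std::string` instead of `tstring`, and the `sketch` namespace and its behavior for non-positive input are assumptions, not the library's confirmed behavior.

```cpp
#include <cassert>
#include <string>

namespace sketch
{
    // Hypothetical stand-in for num_cvt::to_latin_lower: 1 -> "a", 26 -> "z", 27 -> "aa".
    inline std::string to_latin_lower(int val)
    {
        std::string out;
        while (val > 0)
        {
            --val; // shift to 0-based so that 26 maps to 'z' instead of rolling over
            out.insert(out.begin(), static_cast<char>('a' + val % 26));
            val /= 26;
        }
        return out; // empty for val <= 0
    }

    // Hypothetical stand-in for num_cvt::to_roman_lower using subtractive notation.
    inline std::string to_roman_lower(int value)
    {
        static const int   nums[] = { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
        static const char* syms[] = { "m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i" };
        std::string out;
        for (int i = 0; i < 13 && value > 0; ++i)
        {
            while (value >= nums[i])
            {
                out += syms[i];
                value -= nums[i];
            }
        }
        return out; // empty for value <= 0
    }

    // Usage example: markers for the fourth list item.
    inline void usage_example()
    {
        assert(to_latin_lower(4) == "d");
        assert(to_roman_lower(4) == "iv");
    }
}
```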

@ -0,0 +1,85 @@
#ifndef LH_OS_TYPES_H
#define LH_OS_TYPES_H
#include <string>
#include <cstdint>
namespace litehtml
{
#if defined( WIN32 ) || defined( _WIN32 ) || defined( WINCE )
// noexcept is supported since Visual Studio 2015 (_MSC_VER 1900)
#if defined(_MSC_VER) && _MSC_VER < 1900
#define noexcept
#endif
#ifndef LITEHTML_UTF8
typedef std::wstring tstring;
typedef wchar_t tchar_t;
typedef std::wstringstream tstringstream;
#define _t(quote) L##quote
#define t_strlen wcslen
#define t_strcmp wcscmp
#define t_strncmp wcsncmp
#define t_strtol wcstol
#define t_atoi _wtoi
#define t_itoa(value, buffer, size, radix) _itow_s(value, buffer, size, radix)
#define t_strstr wcsstr
#define t_isspace iswspace
#define t_to_string(val) std::to_wstring(val)
#else
typedef std::string tstring;
typedef char tchar_t;
typedef std::stringstream tstringstream;
#define _t(quote) quote
#define t_strlen strlen
#define t_strcmp strcmp
#define t_strncmp strncmp
#define t_strtol strtol
#define t_atoi atoi
#define t_itoa(value, buffer, size, radix) _itoa_s(value, buffer, size, radix)
#define t_strstr strstr
#define t_isspace isspace
#define t_to_string(val) std::to_string(val)
#endif
#ifdef _WIN64
typedef unsigned __int64 uint_ptr;
#else
typedef unsigned int uint_ptr;
#endif
#else
#define LITEHTML_UTF8
typedef std::string tstring;
typedef char tchar_t;
typedef std::uintptr_t uint_ptr;
typedef std::stringstream tstringstream;
#define _t(quote) quote
#define t_strlen strlen
#define t_strcmp strcmp
#define t_strncmp strncmp
#define t_itoa(value, buffer, size, radix) snprintf(buffer, size, "%d", value)
#define t_strtol strtol
#define t_atoi atoi
#define t_strstr strstr
#define t_isspace isspace
#define t_to_string(val) std::to_string(val)
#endif
}
#endif // LH_OS_TYPES_H

@ -0,0 +1,96 @@
#ifndef LH_STYLE_H
#define LH_STYLE_H
#include "attributes.h"
#include <string>
namespace litehtml
{
class property_value
{
public:
tstring m_value;
bool m_important;
property_value()
{
m_important = false;
}
property_value(const tchar_t* val, bool imp)
{
m_important = imp;
m_value = val;
}
property_value(const property_value& val)
{
m_value = val.m_value;
m_important = val.m_important;
}
property_value& operator=(const property_value& val)
{
m_value = val.m_value;
m_important = val.m_important;
return *this;
}
};
typedef std::map<tstring, property_value> props_map;
class style
{
public:
typedef std::shared_ptr<style> ptr;
typedef std::vector<style::ptr> vector;
private:
props_map m_properties;
static string_map m_valid_values;
public:
style() = default;
style(const style& val);
style& operator=(const style& val)
{
m_properties = val.m_properties;
return *this;
}
void add(const tchar_t* txt, const tchar_t* baseurl, const element* el)
{
parse(txt, baseurl, el);
}
void add_property(const tchar_t* name, const tchar_t* val, const tchar_t* baseurl, bool important, const element* el);
const tchar_t* get_property(const tchar_t* name) const
{
if(name)
{
auto f = m_properties.find(name);
if(f != m_properties.end())
{
return f->second.m_value.c_str();
}
}
return nullptr;
}
void combine(const litehtml::style& src);
void clear()
{
m_properties.clear();
}
private:
void parse_property(const tstring& txt, const tchar_t* baseurl, const element* el);
void parse(const tchar_t* txt, const tchar_t* baseurl, const element* el);
void parse_short_border(const tstring& prefix, const tstring& val, bool important);
void parse_short_background(const tstring& val, const tchar_t* baseurl, bool important);
void parse_short_font(const tstring& val, bool important);
static void subst_vars(tstring& str, const element* el);
void add_parsed_property(const tstring& name, const tstring& val, bool important);
void remove_property(const tstring& name, bool important);
};
}
#endif // LH_STYLE_H
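`property_value` pairs each value with an `m_important` flag, but `add_parsed_property` is only declared here. The sketch below shows one plausible way the flag could be honored, assuming the usual CSS rule that a declaration without `!important` does not replace one that has it. The `sketch` namespace, the use of `std::string`, and the exact override rule are assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch
{
    // Hypothetical, reduced counterpart of litehtml::property_value.
    struct property_value
    {
        std::string value;
        bool        important;
    };

    class style
    {
        std::map<std::string, property_value> m_properties;
    public:
        // Assumed rule: an existing !important value survives a later non-important one.
        void add_parsed_property(const std::string& name, const std::string& val, bool important)
        {
            auto it = m_properties.find(name);
            if (it != m_properties.end() && it->second.important && !important)
            {
                return;
            }
            property_value& slot = m_properties[name];
            slot.value = val;
            slot.important = important;
        }

        // Same contract as the header's get_property: nullptr when the name is unknown.
        const char* get_property(const std::string& name) const
        {
            auto it = m_properties.find(name);
            return it != m_properties.end() ? it->second.value.c_str() : nullptr;
        }
    };

    // Usage example: "color: red !important" is not replaced by a plain "color: blue".
    inline void usage_example()
    {
        style s;
        s.add_parsed_property("color", "red", true);
        s.add_parsed_property("color", "blue", false);
        assert(std::string(s.get_property("color")) == "red");
    }
}
```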

@ -0,0 +1,47 @@
#ifndef LH_STYLESHEET_H
#define LH_STYLESHEET_H
#include "style.h"
#include "css_selector.h"
namespace litehtml
{
class document_container;
class css
{
css_selector::vector m_selectors;
public:
css() = default;
~css() = default;
const css_selector::vector& selectors() const
{
return m_selectors;
}
void clear()
{
m_selectors.clear();
}
void parse_stylesheet(const tchar_t* str, const tchar_t* baseurl, const std::shared_ptr <document>& doc, const media_query_list::ptr& media);
void sort_selectors();
static void parse_css_url(const tstring& str, tstring& url);
private:
void parse_atrule(const tstring& text, const tchar_t* baseurl, const std::shared_ptr<document>& doc, const media_query_list::ptr& media);
void add_selector(const css_selector::ptr& selector);
bool parse_selectors(const tstring& txt, const tstring& styles, const media_query_list::ptr& media, const tstring& baseurl);
};
inline void litehtml::css::add_selector( const css_selector::ptr& selector )
{
selector->m_order = (int) m_selectors.size();
m_selectors.push_back(selector);
}
}
#endif // LH_STYLESHEET_H

@ -0,0 +1,251 @@
#ifndef LH_TABLE_H
#define LH_TABLE_H
namespace litehtml
{
struct table_row
{
typedef std::vector<table_row> vector;
int height;
int border_top;
int border_bottom;
element::ptr el_row;
int top;
int bottom;
css_length css_height;
int min_height;
table_row()
{
min_height = 0;
top = 0;
bottom = 0;
border_bottom = 0;
border_top = 0;
height = 0;
el_row = nullptr;
css_height.predef(0);
}
table_row(int h, element::ptr& row)
{
min_height = 0;
height = h;
el_row = row;
border_bottom = 0;
border_top = 0;
top = 0;
bottom = 0;
if (row)
{
css_height = row->get_css_height();
}
}
table_row(const table_row& val)
{
min_height = val.min_height;
top = val.top;
bottom = val.bottom;
border_bottom = val.border_bottom;
border_top = val.border_top;
height = val.height;
css_height = val.css_height;
el_row = val.el_row;
}
table_row(table_row&& val) noexcept
{
min_height = val.min_height;
top = val.top;
bottom = val.bottom;
border_bottom = val.border_bottom;
border_top = val.border_top;
height = val.height;
css_height = val.css_height;
el_row = std::move(val.el_row);
}
};
struct table_column
{
typedef std::vector<table_column> vector;
int min_width;
int max_width;
int width;
css_length css_width;
int border_left;
int border_right;
int left;
int right;
table_column()
{
left = 0;
right = 0;
border_left = 0;
border_right = 0;
min_width = 0;
max_width = 0;
width = 0;
css_width.predef(0);
}
table_column(int min_w, int max_w)
{
left = 0;
right = 0;
border_left = 0;
border_right = 0;
max_width = max_w;
min_width = min_w;
width = 0;
css_width.predef(0);
}
table_column(const table_column& val)
{
left = val.left;
right = val.right;
border_left = val.border_left;
border_right = val.border_right;
max_width = val.max_width;
min_width = val.min_width;
width = val.width;
css_width = val.css_width;
}
};
class table_column_accessor
{
public:
virtual int& get(table_column& col) = 0;
protected:
~table_column_accessor() = default;
};
class table_column_accessor_max_width final : public table_column_accessor
{
public:
int& get(table_column& col) override;
};
class table_column_accessor_min_width final : public table_column_accessor
{
public:
int& get(table_column& col) override;
};
class table_column_accessor_width final : public table_column_accessor
{
public:
int& get(table_column& col) override;
};
struct table_cell
{
element::ptr el;
int colspan;
int rowspan;
int min_width;
int min_height;
int max_width;
int max_height;
int width;
int height;
margins borders;
table_cell()
{
min_width = 0;
min_height = 0;
max_width = 0;
max_height = 0;
width = 0;
height = 0;
colspan = 1;
rowspan = 1;
el = nullptr;
}
table_cell(const table_cell& val)
{
el = val.el;
colspan = val.colspan;
rowspan = val.rowspan;
width = val.width;
height = val.height;
min_width = val.min_width;
min_height = val.min_height;
max_width = val.max_width;
max_height = val.max_height;
borders = val.borders;
}
table_cell(table_cell&& val) noexcept
{
el = std::move(val.el);
colspan = val.colspan;
rowspan = val.rowspan;
width = val.width;
height = val.height;
min_width = val.min_width;
min_height = val.min_height;
max_width = val.max_width;
max_height = val.max_height;
borders = val.borders;
}
};
class table_grid
{
public:
typedef std::vector< std::vector<table_cell> > rows;
private:
int m_rows_count;
int m_cols_count;
rows m_cells;
table_column::vector m_columns;
table_row::vector m_rows;
elements_vector m_captions;
int m_captions_height;
public:
table_grid()
{
m_rows_count = 0;
m_cols_count = 0;
m_captions_height = 0;
}
void clear();
void begin_row(element::ptr& row);
void add_cell(element::ptr& el);
bool is_rowspanned(int r, int c);
void finish();
table_cell* cell(int t_col, int t_row);
table_column& column(int c) { return m_columns[c]; }
table_row& row(int r) { return m_rows[r]; }
elements_vector& captions() { return m_captions; }
int rows_count() const { return m_rows_count; }
int cols_count() const { return m_cols_count; }
void captions_height(int height) { m_captions_height = height; }
int captions_height() const { return m_captions_height; }
void distribute_max_width(int width, int start, int end);
void distribute_min_width(int width, int start, int end);
void distribute_width(int width, int start, int end);
void distribute_width(int width, int start, int end, table_column_accessor* acc);
int calc_table_width(int block_width, bool is_auto, int& min_table_width, int& max_table_width);
void calc_horizontal_positions(margins& table_borders, border_collapse bc, int bdr_space_x);
void calc_vertical_positions(margins& table_borders, border_collapse bc, int bdr_space_y);
void calc_rows_height(int blockHeight, int borderSpacingY);
};
}
#endif // LH_TABLE_H

@ -0,0 +1,136 @@
// Copyright (C) 2020-2021 Primate Labs Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the names of the copyright holders nor the names of their
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#ifndef LITEHTML_TSTRING_VIEW_H__
#define LITEHTML_TSTRING_VIEW_H__
#include <cstddef>
#include <ostream>
#include "litehtml/os_types.h"
namespace litehtml {
// tstring_view is a string reference type that provides a view into a string
// that is owned elsewhere (e.g., by a std::string object).
// tstring_view implements the same interface as std::basic_string_view in the
// standard library. When litehtml moves to C++17 consider replacing the
// tstring_view implementation with the standard library implementations
// (e.g., via a using statement).
class tstring_view {
public:
using value_type = tchar_t;
using pointer = tchar_t*;
using const_pointer = const tchar_t*;
using reference = tchar_t&;
using const_reference = const tchar_t&;
using iterator = const_pointer;
using const_iterator = const_pointer;
using size_type = size_t;
using difference_type = std::ptrdiff_t;
public:
tstring_view() = default;
tstring_view(const tstring_view& other) = default;
tstring_view(const_pointer s, size_type size)
: data_(s)
, size_(size)
{
}
constexpr const_iterator begin() const
{
return data_;
}
constexpr const_iterator cbegin() const
{
return data_;
}
constexpr const_iterator end() const
{
return data_ + size_;
}
constexpr const_iterator cend() const
{
return data_ + size_;
}
constexpr const_reference operator[](size_type offset) const
{
return *(data_ + offset);
}
constexpr const_pointer data() const
{
return data_;
}
size_type size() const
{
return size_;
}
size_type length() const
{
return size_;
}
bool empty() const
{
return (size_ == 0);
}
private:
const_pointer data_ = nullptr;
size_type size_ = 0;
};
std::basic_ostream<tstring_view::value_type>& operator<<(
std::basic_ostream<tstring_view::value_type>&,
tstring_view str);
} // namespace litehtml
#endif // LITEHTML_TSTRING_VIEW_H__
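The header declares `operator<<` for `tstring_view` without defining it, and the class comment stresses that the view does not own its characters. Here is a minimal sketch of both points, assuming a narrow-character build. `sketch::string_view` is a hypothetical, reduced copy of the class, and `first_word` is a helper invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch
{
    // Reduced stand-in for litehtml::tstring_view: a pointer plus a length, no ownership.
    class string_view
    {
    public:
        string_view() = default;
        string_view(const char* s, std::size_t size) : data_(s), size_(size) {}
        const char* data() const { return data_; }
        std::size_t size() const { return size_; }
        bool empty() const { return size_ == 0; }
    private:
        const char* data_ = nullptr;
        std::size_t size_ = 0;
    };

    // One possible definition of the stream operator: write exactly size() characters,
    // because the viewed range is not required to be null-terminated.
    inline std::ostream& operator<<(std::ostream& os, string_view str)
    {
        if (!str.empty())
        {
            os.write(str.data(), static_cast<std::streamsize>(str.size()));
        }
        return os;
    }

    // Hypothetical helper: views the first space-delimited word of `owner` without
    // copying, then materializes it through the stream operator.
    inline std::string first_word(const std::string& owner)
    {
        const std::string::size_type space = owner.find(' ');
        const string_view word(owner.data(), space == std::string::npos ? owner.size() : space);
        std::ostringstream out;
        out << word;
        return out.str();
    }

    // Usage example: the view stays valid only while `text` is alive.
    inline void usage_example()
    {
        const std::string text = "hello world";
        assert(first_word(text) == "hello");
    }
}
```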

@ -0,0 +1,750 @@
#ifndef LH_TYPES_H
#define LH_TYPES_H
#include <stdlib.h>
#include <memory>
#include <map>
#include <vector>
namespace litehtml
{
class document;
class element;
typedef std::map<litehtml::tstring, litehtml::tstring> string_map;
typedef std::vector< std::shared_ptr<litehtml::element> > elements_vector;
typedef std::vector<int> int_vector;
typedef std::vector<litehtml::tstring> string_vector;
const unsigned int font_decoration_none = 0x00;
const unsigned int font_decoration_underline = 0x01;
const unsigned int font_decoration_linethrough = 0x02;
const unsigned int font_decoration_overline = 0x04;
typedef unsigned char byte;
typedef unsigned int ucode_t;
struct margins
{
int left;
int right;
int top;
int bottom;
margins()
{
left = right = top = bottom = 0;
}
int width() const { return left + right; }
int height() const { return top + bottom; }
};
struct size
{
int width;
int height;
size()
{
width = 0;
height = 0;
}
};
struct position
{
typedef std::vector<position> vector;
int x;
int y;
int width;
int height;
position()
{
x = y = width = height = 0;
}
position(int x, int y, int width, int height)
{
this->x = x;
this->y = y;
this->width = width;
this->height = height;
}
int right() const { return x + width; }
int bottom() const { return y + height; }
int left() const { return x; }
int top() const { return y; }
void operator+=(const margins& mg)
{
x -= mg.left;
y -= mg.top;
width += mg.left + mg.right;
height += mg.top + mg.bottom;
}
void operator-=(const margins& mg)
{
x += mg.left;
y += mg.top;
width -= mg.left + mg.right;
height -= mg.top + mg.bottom;
}
void clear()
{
x = y = width = height = 0;
}
void operator=(const size& sz)
{
width = sz.width;
height = sz.height;
}
void move_to(int x, int y)
{
this->x = x;
this->y = y;
}
bool does_intersect(const position* val) const
{
if(!val) return true;
return (
left() <= val->right() &&
right() >= val->left() &&
bottom() >= val->top() &&
top() <= val->bottom() )
|| (
val->left() <= right() &&
val->right() >= left() &&
val->bottom() >= top() &&
val->top() <= bottom() );
}
bool empty() const
{
if(!width && !height)
{
return true;
}
return false;
}
bool is_point_inside(int x, int y) const
{
if(x >= left() && x <= right() && y >= top() && y <= bottom())
{
return true;
}
return false;
}
};
struct font_metrics
{
int height;
int ascent;
int descent;
int x_height;
bool draw_spaces;
font_metrics()
{
height = 0;
ascent = 0;
descent = 0;
x_height = 0;
draw_spaces = true;
}
int base_line() const { return descent; }
};
struct font_item
{
uint_ptr font;
font_metrics metrics;
};
typedef std::map<tstring, font_item> fonts_map;
enum draw_flag
{
draw_root,
draw_block,
draw_floats,
draw_inlines,
draw_positioned,
};
#define style_display_strings _t("none;block;inline;inline-block;inline-table;list-item;table;table-caption;table-cell;table-column;table-column-group;table-footer-group;table-header-group;table-row;table-row-group;inline-text")
enum style_display
{
display_none,
display_block,
display_inline,
display_inline_block,
display_inline_table,
display_list_item,
display_table,
display_table_caption,
display_table_cell,
display_table_column,
display_table_column_group,
display_table_footer_group,
display_table_header_group,
display_table_row,
display_table_row_group,
display_inline_text,
};
enum style_border
{
borderNope,
borderNone,
borderHidden,
borderDotted,
borderDashed,
borderSolid,
borderDouble
};
#define font_size_strings _t("xx-small;x-small;small;medium;large;x-large;xx-large;smaller;larger")
enum font_size
{
fontSize_xx_small,
fontSize_x_small,
fontSize_small,
fontSize_medium,
fontSize_large,
fontSize_x_large,
fontSize_xx_large,
fontSize_smaller,
fontSize_larger,
};
#define font_style_strings _t("normal;italic")
enum font_style
{
fontStyleNormal,
fontStyleItalic
};
#define font_variant_strings _t("normal;small-caps")
enum font_variant
{
font_variant_normal,
font_variant_italic // corresponds to "small-caps" in font_variant_strings; the enumerator name is misleading
};
#define font_weight_strings _t("normal;bold;bolder;lighter;100;200;300;400;500;600;700")
enum font_weight
{
fontWeightNormal,
fontWeightBold,
fontWeightBolder,
fontWeightLighter,
fontWeight100,
fontWeight200,
fontWeight300,
fontWeight400,
fontWeight500,
fontWeight600,
fontWeight700
};
#define list_style_type_strings _t("none;circle;disc;square;armenian;cjk-ideographic;decimal;decimal-leading-zero;georgian;hebrew;hiragana;hiragana-iroha;katakana;katakana-iroha;lower-alpha;lower-greek;lower-latin;lower-roman;upper-alpha;upper-latin;upper-roman")
enum list_style_type
{
list_style_type_none,
list_style_type_circle,
list_style_type_disc,
list_style_type_square,
list_style_type_armenian,
list_style_type_cjk_ideographic,
list_style_type_decimal,
list_style_type_decimal_leading_zero,
list_style_type_georgian,
list_style_type_hebrew,
list_style_type_hiragana,
list_style_type_hiragana_iroha,
list_style_type_katakana,
list_style_type_katakana_iroha,
list_style_type_lower_alpha,
list_style_type_lower_greek,
list_style_type_lower_latin,
list_style_type_lower_roman,
list_style_type_upper_alpha,
list_style_type_upper_latin,
list_style_type_upper_roman,
};
#define list_style_position_strings _t("inside;outside")
enum list_style_position
{
list_style_position_inside,
list_style_position_outside
};
#define vertical_align_strings _t("baseline;sub;super;top;text-top;middle;bottom;text-bottom")
enum vertical_align
{
va_baseline,
va_sub,
va_super,
va_top,
va_text_top,
va_middle,
va_bottom,
va_text_bottom
};
#define border_width_strings _t("thin;medium;thick")
enum border_width
{
border_width_thin,
border_width_medium,
border_width_thick
};
#define border_style_strings _t("none;hidden;dotted;dashed;solid;double;groove;ridge;inset;outset")
enum border_style
{
border_style_none,
border_style_hidden,
border_style_dotted,
border_style_dashed,
border_style_solid,
border_style_double,
border_style_groove,
border_style_ridge,
border_style_inset,
border_style_outset
};
#define element_float_strings _t("none;left;right")
enum element_float
{
float_none,
float_left,
float_right
};
#define element_clear_strings _t("none;left;right;both")
enum element_clear
{
clear_none,
clear_left,
clear_right,
clear_both
};
#define css_units_strings _t("none;%;in;cm;mm;em;ex;pt;pc;px;dpi;dpcm;vw;vh;vmin;vmax;rem")
enum css_units
{
css_units_none,
css_units_percentage,
css_units_in,
css_units_cm,
css_units_mm,
css_units_em,
css_units_ex,
css_units_pt,
css_units_pc,
css_units_px,
css_units_dpi,
css_units_dpcm,
css_units_vw,
css_units_vh,
css_units_vmin,
css_units_vmax,
css_units_rem,
};
#define background_attachment_strings _t("scroll;fixed")
enum background_attachment
{
background_attachment_scroll,
background_attachment_fixed
};
#define background_repeat_strings _t("repeat;repeat-x;repeat-y;no-repeat")
enum background_repeat
{
background_repeat_repeat,
background_repeat_repeat_x,
background_repeat_repeat_y,
background_repeat_no_repeat
};
#define background_box_strings _t("border-box;padding-box;content-box")
enum background_box
{
background_box_border,
background_box_padding,
background_box_content
};
#define element_position_strings _t("static;relative;absolute;fixed")
enum element_position
{
element_position_static,
element_position_relative,
element_position_absolute,
element_position_fixed,
};
#define text_align_strings _t("left;right;center;justify")
enum text_align
{
text_align_left,
text_align_right,
text_align_center,
text_align_justify
};
#define text_transform_strings _t("none;capitalize;uppercase;lowercase")
enum text_transform
{
text_transform_none,
text_transform_capitalize,
text_transform_uppercase,
text_transform_lowercase
};
#define white_space_strings _t("normal;nowrap;pre;pre-line;pre-wrap")
enum white_space
{
white_space_normal,
white_space_nowrap,
white_space_pre,
white_space_pre_line,
white_space_pre_wrap
};
#define overflow_strings _t("visible;hidden;scroll;auto;no-display;no-content")
enum overflow
{
overflow_visible,
overflow_hidden,
overflow_scroll,
overflow_auto,
overflow_no_display,
overflow_no_content
};
#define background_size_strings _t("auto;cover;contain")
enum background_size
{
background_size_auto,
background_size_cover,
background_size_contain,
};
#define visibility_strings _t("visible;hidden;collapse")
enum visibility
{
visibility_visible,
visibility_hidden,
visibility_collapse,
};
#define border_collapse_strings _t("collapse;separate")
enum border_collapse
{
border_collapse_collapse,
border_collapse_separate,
};
#define pseudo_class_strings _t("only-child;only-of-type;first-child;first-of-type;last-child;last-of-type;nth-child;nth-of-type;nth-last-child;nth-last-of-type;not;lang")
enum pseudo_class
{
pseudo_class_only_child,
pseudo_class_only_of_type,
pseudo_class_first_child,
pseudo_class_first_of_type,
pseudo_class_last_child,
pseudo_class_last_of_type,
pseudo_class_nth_child,
pseudo_class_nth_of_type,
pseudo_class_nth_last_child,
pseudo_class_nth_last_of_type,
pseudo_class_not,
pseudo_class_lang,
};
#define content_property_string _t("none;normal;open-quote;close-quote;no-open-quote;no-close-quote")
enum content_property
{
content_property_none,
content_property_normal,
content_property_open_quote,
content_property_close_quote,
content_property_no_open_quote,
content_property_no_close_quote,
};
struct floated_box
{
typedef std::vector<floated_box> vector;
position pos;
element_float float_side;
element_clear clear_floats;
std::shared_ptr<element> el;
floated_box() = default;
floated_box(const floated_box& val)
{
pos = val.pos;
float_side = val.float_side;
clear_floats = val.clear_floats;
el = val.el;
}
floated_box& operator=(const floated_box& val)
{
pos = val.pos;
float_side = val.float_side;
clear_floats = val.clear_floats;
el = val.el;
return *this;
}
floated_box(floated_box&& val)
{
pos = val.pos;
float_side = val.float_side;
clear_floats = val.clear_floats;
el = std::move(val.el);
}
void operator=(floated_box&& val)
{
pos = val.pos;
float_side = val.float_side;
clear_floats = val.clear_floats;
el = std::move(val.el);
}
};
struct int_int_cache
{
int hash;
int val;
bool is_valid;
bool is_default;
int_int_cache()
{
hash = 0;
val = 0;
is_valid = false;
is_default = false;
}
void invalidate()
{
is_valid = false;
is_default = false;
}
void set_value(int vHash, int vVal)
{
hash = vHash;
val = vVal;
is_valid = true;
}
};
enum select_result
{
select_no_match = 0x00,
select_match = 0x01,
select_match_pseudo_class = 0x02,
select_match_with_before = 0x10,
select_match_with_after = 0x20,
};
template<class T>
class def_value
{
T m_val;
bool m_is_default;
public:
def_value(T def_val)
{
m_is_default = true;
m_val = def_val;
}
void reset(T def_val)
{
m_is_default = true;
m_val = def_val;
}
bool is_default()
{
return m_is_default;
}
T operator=(T new_val)
{
m_val = new_val;
m_is_default = false;
return m_val;
}
operator T()
{
return m_val;
}
};
#define media_orientation_strings _t("portrait;landscape")
enum media_orientation
{
media_orientation_portrait,
media_orientation_landscape,
};
#define media_feature_strings _t("none;width;min-width;max-width;height;min-height;max-height;device-width;min-device-width;max-device-width;device-height;min-device-height;max-device-height;orientation;aspect-ratio;min-aspect-ratio;max-aspect-ratio;device-aspect-ratio;min-device-aspect-ratio;max-device-aspect-ratio;color;min-color;max-color;color-index;min-color-index;max-color-index;monochrome;min-monochrome;max-monochrome;resolution;min-resolution;max-resolution")
enum media_feature
{
media_feature_none,
media_feature_width,
media_feature_min_width,
media_feature_max_width,
media_feature_height,
media_feature_min_height,
media_feature_max_height,
media_feature_device_width,
media_feature_min_device_width,
media_feature_max_device_width,
media_feature_device_height,
media_feature_min_device_height,
media_feature_max_device_height,
media_feature_orientation,
media_feature_aspect_ratio,
media_feature_min_aspect_ratio,
media_feature_max_aspect_ratio,
media_feature_device_aspect_ratio,
media_feature_min_device_aspect_ratio,
media_feature_max_device_aspect_ratio,
media_feature_color,
media_feature_min_color,
media_feature_max_color,
media_feature_color_index,
media_feature_min_color_index,
media_feature_max_color_index,
media_feature_monochrome,
media_feature_min_monochrome,
media_feature_max_monochrome,
media_feature_resolution,
media_feature_min_resolution,
media_feature_max_resolution,
};
#define box_sizing_strings _t("content-box;border-box")
enum box_sizing
{
box_sizing_content_box,
box_sizing_border_box,
};
#define media_type_strings _t("none;all;screen;print;braille;embossed;handheld;projection;speech;tty;tv")
enum media_type
{
media_type_none,
media_type_all,
media_type_screen,
media_type_print,
media_type_braille,
media_type_embossed,
media_type_handheld,
media_type_projection,
media_type_speech,
media_type_tty,
media_type_tv,
};
struct media_features
{
media_type type;
int width; // (pixels) For continuous media, this is the width of the viewport including the size of a rendered scroll bar (if any). For paged media, this is the width of the page box.
int height; // (pixels) The height of the targeted display area of the output device. For continuous media, this is the height of the viewport including the size of a rendered scroll bar (if any). For paged media, this is the height of the page box.
int device_width; // (pixels) The width of the rendering surface of the output device. For continuous media, this is the width of the screen. For paged media, this is the width of the page sheet size.
int device_height; // (pixels) The height of the rendering surface of the output device. For continuous media, this is the height of the screen. For paged media, this is the height of the page sheet size.
int color; // The number of bits per color component of the output device. If the device is not a color device, the value is zero.
int color_index; // The number of entries in the color lookup table of the output device. If the device does not use a color lookup table, the value is zero.
int monochrome; // The number of bits per pixel in a monochrome frame buffer. If the device is not a monochrome device, the output device value will be 0.
int resolution; // The resolution of the output device (in DPI)
media_features()
{
type = media_type::media_type_none;
width = 0;
height = 0;
device_width = 0;
device_height = 0;
color = 0;
color_index = 0;
monochrome = 0;
resolution = 0;
}
};
enum render_type
{
render_all,
render_no_fixed,
render_fixed_only,
};
// List of the Void Elements (can't have any contents)
const litehtml::tchar_t* const void_elements = _t("area;base;br;col;command;embed;hr;img;input;keygen;link;meta;param;source;track;wbr");
}
#endif // LH_TYPES_H

View File

@ -0,0 +1,139 @@
// Copyright (C) 2020-2021 Primate Labs Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the names of the copyright holders nor the names of their
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#ifndef LITEHTML_URL_H__
#define LITEHTML_URL_H__
#include <ostream>
#include "litehtml/os_types.h"
// https://datatracker.ietf.org/doc/html/rfc3986
namespace litehtml {
class url {
public:
url() = default;
explicit url(const tstring& str);
url(const tstring& scheme,
const tstring& authority,
const tstring& path,
const tstring& query,
const tstring& fragment);
const tstring& string() const
{
return str_;
}
const tstring& scheme() const
{
return scheme_;
}
bool has_scheme() const
{
return !scheme_.empty();
}
const tstring& authority() const
{
return authority_;
}
bool has_authority() const
{
return !authority_.empty();
}
const tstring& path() const
{
return path_;
}
bool has_path() const
{
return !path_.empty();
}
const tstring& query() const
{
return query_;
}
bool has_query() const
{
return !query_.empty();
}
const tstring& fragment() const
{
return fragment_;
}
bool has_fragment() const
{
return !fragment_.empty();
}
protected:
tstring str_;
// Assume URLs are relative by default. See RFC 3986 Section 4.3 for
// information on which URLs are considered relative and which URLs are
// considered absolute:
//
// https://datatracker.ietf.org/doc/html/rfc3986#section-4.3
bool absolute_ = false;
tstring scheme_;
tstring authority_;
tstring path_;
tstring query_;
tstring fragment_;
};
// Returns a URL that is resolved from the reference URL that might be
// relative to the base URL. For example, given <https://www.twitter.com/> as
// the base URL and </foo> as the relative URL, resolve() will return the URL
// <https://www.twitter.com/foo>.
url resolve(const url& base, const url& reference);
} // namespace litehtml
#endif // LITEHTML_URL_H__
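
The header above only declares `url(const tstring& str)`; the parsing code is not part of this hunk. As a rough sketch of how a string can be split into the five components the class stores, the regular expression from RFC 3986, Appendix B can be used. The names `url_parts` and `split_url` are hypothetical and `std::string` stands in for `tstring`; litehtml's own parser may work differently.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for the five components that litehtml::url stores.
struct url_parts {
    std::string scheme;
    std::string authority;
    std::string path;
    std::string query;
    std::string fragment;
};

// Splits a URL reference with the regular expression given in RFC 3986,
// Appendix B. This is an assumption for illustration, not litehtml's parser.
url_parts split_url(const std::string& str)
{
    static const std::regex re(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");

    url_parts parts;
    std::smatch m;
    if (std::regex_match(str, m, re)) {
        parts.scheme = m[2].str();    // text before the first ':'
        parts.authority = m[4].str(); // text after "//"
        parts.path = m[5].str();
        parts.query = m[7].str();     // text after '?'
        parts.fragment = m[9].str();  // text after '#'
    }
    return parts;
}
```

With this sketch, a reference such as `/foo` yields an empty scheme and authority, which matches the header's default assumption that a URL is relative until a scheme is found.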

View File

@ -0,0 +1,51 @@
// Copyright (C) 2020-2021 Primate Labs Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the names of the copyright holders nor the names of their
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#ifndef LITEHTML_URL_PATH_H__
#define LITEHTML_URL_PATH_H__
#include <ostream>
#include "litehtml/os_types.h"
namespace litehtml {
bool is_url_path_absolute(const tstring& path);
tstring url_path_directory_name(const tstring& path);
tstring url_path_base_name(const tstring& path);
tstring url_path_append(const tstring& base, const tstring& path);
tstring url_path_resolve(const tstring& base, const tstring& path);
} // namespace litehtml
#endif // LITEHTML_URL_PATH_H__

View File

@ -0,0 +1,59 @@
#ifndef LH_UTF8_STRINGS_H
#define LH_UTF8_STRINGS_H
#include "os_types.h"
#include "types.h"
namespace litehtml
{
class utf8_to_wchar
{
const byte* m_utf8;
std::wstring m_str;
public:
utf8_to_wchar(const char* val);
operator const wchar_t*() const
{
return m_str.c_str();
}
private:
ucode_t getb()
{
if (!(*m_utf8)) return 0;
return *m_utf8++;
}
ucode_t get_next_utf8(ucode_t val)
{
return (val & 0x3f);
}
ucode_t get_char();
};
class wchar_to_utf8
{
std::string m_str;
public:
wchar_to_utf8(const std::wstring& val);
operator const char*() const
{
return m_str.c_str();
}
const char* c_str() const
{
return m_str.c_str();
}
};
#ifdef LITEHTML_UTF8
#define litehtml_from_utf8(str) str
#define litehtml_to_utf8(str) str
#define litehtml_from_wchar(str) litehtml::wchar_to_utf8(str)
#else
#define litehtml_from_utf8(str) litehtml::utf8_to_wchar(str)
#define litehtml_from_wchar(str) str
#define litehtml_to_utf8(str) litehtml::wchar_to_utf8(str)
#endif
}
#endif // LH_UTF8_STRINGS_H
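
`utf8_to_wchar::get_char()` is declared above but defined elsewhere, and `get_next_utf8()` only shows the `0x3f` mask applied to continuation bytes. The following is a minimal sketch of the general UTF-8 decoding technique that mask belongs to. The function `decode_utf8` is hypothetical, it replaces malformed sequences with U+FFFD, and it does not reject overlong encodings or surrogates; litehtml's implementation may behave differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical free function, not litehtml's utf8_to_wchar::get_char().
std::vector<std::uint32_t> decode_utf8(const std::string& in)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i++]);
        std::uint32_t cp = 0;
        int extra = 0; // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xe0) == 0xc0) { cp = lead & 0x1f; extra = 1; }
        else if ((lead & 0xf0) == 0xe0) { cp = lead & 0x0f; extra = 2; }
        else if ((lead & 0xf8) == 0xf0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xfffd); continue; } // invalid lead byte

        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            // Every continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xc0) != 0x80) {
                ok = false;
                break;
            }
            // Same mask as get_next_utf8(): keep the low six bits.
            cp = (cp << 6) | (static_cast<unsigned char>(in[i++]) & 0x3f);
        }
        out.push_back(ok ? cp : 0xfffd);
    }
    return out;
}
```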

View File

@ -0,0 +1,61 @@
#ifndef LH_WEB_COLOR_H
#define LH_WEB_COLOR_H
namespace litehtml
{
struct def_color
{
const tchar_t* name;
const tchar_t* rgb;
};
extern def_color g_def_colors[];
class document_container;
struct web_color
{
byte blue;
byte green;
byte red;
byte alpha;
web_color(byte r, byte g, byte b, byte a = 255)
{
blue = b;
green = g;
red = r;
alpha = a;
}
web_color()
{
blue = 0;
green = 0;
red = 0;
alpha = 0xFF;
}
web_color(const web_color& val)
{
blue = val.blue;
green = val.green;
red = val.red;
alpha = val.alpha;
}
web_color& operator=(const web_color& val)
{
blue = val.blue;
green = val.green;
red = val.red;
alpha = val.alpha;
return *this;
}
static web_color from_string(const tchar_t* str, litehtml::document_container* callback);
static litehtml::tstring resolve_name(const tchar_t* name, litehtml::document_container* callback);
static bool is_color(const tchar_t* str);
};
}
#endif // LH_WEB_COLOR_H
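
`web_color::from_string()` is only declared in this header. As an illustration of the kind of input it has to handle, the sketch below parses the `#rgb` and `#rrggbb` notations (the master stylesheet uses `#00f` for links). The names `rgba`, `hex_value`, and `parse_hex_color` are hypothetical, and named colors, `rgb()`, and the `document_container` callback are deliberately left out.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for litehtml::web_color.
struct rgba {
    std::uint8_t red;
    std::uint8_t green;
    std::uint8_t blue;
    std::uint8_t alpha;
};

// Returns the value of one hexadecimal digit, or -1 if c is not a hex digit.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses "#rgb" or "#rrggbb" into out; returns false for anything else.
bool parse_hex_color(const std::string& str, rgba& out)
{
    if (str.empty() || str[0] != '#') return false;
    const bool is_short = (str.size() == 4);
    if (!is_short && str.size() != 7) return false;

    int channel[3];
    for (int i = 0; i < 3; ++i) {
        // In the short form each digit is doubled: "#00f" means "#0000ff".
        const int hi = hex_value(str[is_short ? 1 + i : 1 + 2 * i]);
        const int lo = is_short ? hi : hex_value(str[2 + 2 * i]);
        if (hi < 0 || lo < 0) return false;
        channel[i] = hi * 16 + lo;
    }
    out = rgba{static_cast<std::uint8_t>(channel[0]),
               static_cast<std::uint8_t>(channel[1]),
               static_cast<std::uint8_t>(channel[2]),
               255}; // opaque, like the default alpha of web_color
    return true;
}
```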

View File

@ -0,0 +1,327 @@
html {
display: block;
height:100%;
width:100%;
position: relative;
}
head {
display: none
}
meta {
display: none
}
title {
display: none
}
link {
display: none
}
style {
display: none
}
script {
display: none
}
body {
display:block;
margin:8px;
height:100%;
width:100%;
}
p {
display:block;
margin-top:1em;
margin-bottom:1em;
}
b, strong {
display:inline;
font-weight:bold;
}
i, em {
display:inline;
font-style:italic;
}
center
{
text-align:center;
display:block;
}
a:link
{
text-decoration: underline;
color: #00f;
cursor: pointer;
}
h1, h2, h3, h4, h5, h6, div {
display:block;
}
h1 {
font-weight:bold;
margin-top:0.67em;
margin-bottom:0.67em;
font-size: 2em;
}
h2 {
font-weight:bold;
margin-top:0.83em;
margin-bottom:0.83em;
font-size: 1.5em;
}
h3 {
font-weight:bold;
margin-top:1em;
margin-bottom:1em;
font-size:1.17em;
}
h4 {
font-weight:bold;
margin-top:1.33em;
margin-bottom:1.33em
}
h5 {
font-weight:bold;
margin-top:1.67em;
margin-bottom:1.67em;
font-size:.83em;
}
h6 {
font-weight:bold;
margin-top:2.33em;
margin-bottom:2.33em;
font-size:.67em;
}
br {
display:inline-block;
}
br[clear="all"]
{
clear:both;
}
br[clear="left"]
{
clear:left;
}
br[clear="right"]
{
clear:right;
}
span {
display:inline
}
img {
display: inline-block;
}
img[align="right"]
{
float: right;
}
img[align="left"]
{
float: left;
}
hr {
display: block;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: auto;
margin-right: auto;
border-style: inset;
border-width: 1px
}
/***************** TABLES ********************/
table {
display: table;
border-collapse: separate;
border-spacing: 2px;
border-top-color:gray;
border-left-color:gray;
border-bottom-color:black;
border-right-color:black;
}
tbody, tfoot, thead {
display:table-row-group;
vertical-align:middle;
}
tr {
display: table-row;
vertical-align: inherit;
border-color: inherit;
}
td, th {
display: table-cell;
vertical-align: inherit;
border-width:1px;
padding:1px;
}
th {
font-weight: bold;
}
table[border] {
border-style:solid;
}
table[border|=0] {
border-style:none;
}
table[border] td, table[border] th {
border-style:solid;
border-top-color:black;
border-left-color:black;
border-bottom-color:gray;
border-right-color:gray;
}
table[border|=0] td, table[border|=0] th {
border-style:none;
}
caption {
display: table-caption;
}
td[nowrap], th[nowrap] {
white-space:nowrap;
}
tt, code, kbd, samp {
font-family: monospace
}
pre, xmp, plaintext, listing {
display: block;
font-family: monospace;
white-space: pre;
margin: 1em 0
}
/***************** LISTS ********************/
ul, menu, dir {
display: block;
list-style-type: disc;
margin-top: 1em;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
padding-left: 40px
}
ol {
display: block;
list-style-type: decimal;
margin-top: 1em;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
padding-left: 40px
}
li {
display: list-item;
}
ul ul, ol ul {
list-style-type: circle;
}
ol ol ul, ol ul ul, ul ol ul, ul ul ul {
list-style-type: square;
}
dd {
display: block;
margin-left: 40px;
}
dl {
display: block;
margin-top: 1em;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
}
dt {
display: block;
}
ol ul, ul ol, ul ul, ol ol {
margin-top: 0;
margin-bottom: 0
}
blockquote {
display: block;
margin-top: 1em;
margin-bottom: 1em;
margin-left: 40px;
margin-right: 40px;
}
/*********** FORM ELEMENTS ************/
form {
display: block;
margin-top: 0em;
}
option {
display: none;
}
input, textarea, keygen, select, button, isindex {
margin: 0em;
color: initial;
line-height: normal;
text-transform: none;
text-indent: 0;
text-shadow: none;
display: inline-block;
}
input[type="hidden"] {
display: none;
}
article, aside, footer, header, hgroup, nav, section
{
display: block;
}

View File

@ -0,0 +1,76 @@
#include "html.h"
#include "background.h"
litehtml::background::background()
{
m_attachment = background_attachment_scroll;
m_repeat = background_repeat_repeat;
m_clip = background_box_border;
m_origin = background_box_padding;
m_color.alpha = 0;
m_color.red = 0;
m_color.green = 0;
m_color.blue = 0;
}
litehtml::background::background( const background& val )
{
m_image = val.m_image;
m_baseurl = val.m_baseurl;
m_color = val.m_color;
m_attachment = val.m_attachment;
m_position = val.m_position;
m_repeat = val.m_repeat;
m_clip = val.m_clip;
m_origin = val.m_origin;
}
litehtml::background& litehtml::background::operator=( const background& val )
{
m_image = val.m_image;
m_baseurl = val.m_baseurl;
m_color = val.m_color;
m_attachment = val.m_attachment;
m_position = val.m_position;
m_repeat = val.m_repeat;
m_clip = val.m_clip;
m_origin = val.m_origin;
return *this;
}
litehtml::background_paint::background_paint() : color(0, 0, 0, 0)
{
position_x = 0;
position_y = 0;
attachment = background_attachment_scroll;
repeat = background_repeat_repeat;
is_root = false;
}
litehtml::background_paint::background_paint( const background_paint& val )
{
image = val.image;
baseurl = val.baseurl;
attachment = val.attachment;
repeat = val.repeat;
color = val.color;
clip_box = val.clip_box;
origin_box = val.origin_box;
border_box = val.border_box;
border_radius = val.border_radius;
image_size = val.image_size;
position_x = val.position_x;
position_y = val.position_y;
is_root = val.is_root;
}
litehtml::background_paint& litehtml::background_paint::operator=( const background& val )
{
attachment = val.m_attachment;
baseurl = val.m_baseurl;
image = val.m_image;
repeat = val.m_repeat;
color = val.m_color;
return *this;
}

View File

@ -0,0 +1,432 @@
#include "html.h"
#include "box.h"
#include "html_tag.h"
litehtml::box_type litehtml::block_box::get_type() const
{
return box_block;
}
int litehtml::block_box::height() const
{
return m_element->height();
}
int litehtml::block_box::width() const
{
return m_element->width();
}
void litehtml::block_box::add_element(const element::ptr &el)
{
m_element = el;
el->m_box = this;
}
void litehtml::block_box::finish(bool last_box)
{
if(!m_element) return;
m_element->apply_relative_shift(m_box_right - m_box_left);
}
bool litehtml::block_box::can_hold(const element::ptr &el, white_space ws) const
{
if(m_element || el->is_inline_box())
{
return false;
}
return true;
}
bool litehtml::block_box::is_empty() const
{
if(m_element)
{
return false;
}
return true;
}
int litehtml::block_box::baseline() const
{
if(m_element)
{
return m_element->get_base_line();
}
return 0;
}
void litehtml::block_box::get_elements( elements_vector& els )
{
els.push_back(m_element);
}
int litehtml::block_box::top_margin() const
{
if(m_element && m_element->collapse_top_margin())
{
return m_element->m_margins.top;
}
return 0;
}
int litehtml::block_box::bottom_margin() const
{
if(m_element && m_element->collapse_bottom_margin())
{
return m_element->m_margins.bottom;
}
return 0;
}
void litehtml::block_box::y_shift( int shift )
{
m_box_top += shift;
if(m_element)
{
m_element->m_pos.y += shift;
}
}
void litehtml::block_box::new_width( int left, int right, elements_vector& els )
{
}
//////////////////////////////////////////////////////////////////////////
litehtml::box_type litehtml::line_box::get_type() const
{
return box_line;
}
int litehtml::line_box::height() const
{
return m_height;
}
int litehtml::line_box::width() const
{
return m_width;
}
void litehtml::line_box::add_element(const element::ptr &el)
{
el->m_skip = false;
el->m_box = nullptr;
bool add = true;
if( (m_items.empty() && el->is_white_space()) || el->is_break() )
{
el->m_skip = true;
} else if(el->is_white_space())
{
if (have_last_space())
{
add = false;
el->m_skip = true;
}
}
if(add)
{
el->m_box = this;
m_items.push_back(el);
if(!el->m_skip)
{
int el_shift_left = el->get_inline_shift_left();
int el_shift_right = el->get_inline_shift_right();
el->m_pos.x = m_box_left + m_width + el_shift_left + el->content_margins_left();
el->m_pos.y = m_box_top + el->content_margins_top();
m_width += el->width() + el_shift_left + el_shift_right;
}
}
}
void litehtml::line_box::finish(bool last_box)
{
if( is_empty() || (last_box && is_break_only()) )
{
m_height = 0;
return;
}
for(auto i = m_items.rbegin(); i != m_items.rend(); i++)
{
if((*i)->is_white_space() || (*i)->is_break())
{
if(!(*i)->m_skip)
{
(*i)->m_skip = true;
m_width -= (*i)->width();
}
} else
{
break;
}
}
int base_line = m_font_metrics.base_line();
int line_height = m_line_height;
int add_x = 0;
switch(m_text_align)
{
case text_align_right:
if(m_width < (m_box_right - m_box_left))
{
add_x = (m_box_right - m_box_left) - m_width;
}
break;
case text_align_center:
if(m_width < (m_box_right - m_box_left))
{
add_x = ((m_box_right - m_box_left) - m_width) / 2;
}
break;
default:
add_x = 0;
}
m_height = 0;
// find line box baseline and line-height
for(const auto& el : m_items)
{
if(el->get_display() == display_inline_text)
{
font_metrics fm;
el->get_font(&fm);
base_line = std::max(base_line, fm.base_line());
line_height = std::max(line_height, el->line_height());
m_height = std::max(m_height, fm.height);
}
el->m_pos.x += add_x;
}
if(m_height)
{
base_line += (line_height - m_height) / 2;
}
m_height = line_height;
int y1 = 0;
int y2 = m_height;
for (const auto& el : m_items)
{
if(el->get_display() == display_inline_text)
{
font_metrics fm;
el->get_font(&fm);
el->m_pos.y = m_height - base_line - fm.ascent;
} else
{
switch(el->get_vertical_align())
{
case va_super:
case va_sub:
case va_baseline:
el->m_pos.y = m_height - base_line - el->height() + el->get_base_line() + el->content_margins_top();
break;
case va_top:
el->m_pos.y = y1 + el->content_margins_top();
break;
case va_text_top:
el->m_pos.y = m_height - base_line - m_font_metrics.ascent + el->content_margins_top();
break;
case va_middle:
el->m_pos.y = m_height - base_line - m_font_metrics.x_height / 2 - el->height() / 2 + el->content_margins_top();
break;
case va_bottom:
el->m_pos.y = y2 - el->height() + el->content_margins_top();
break;
case va_text_bottom:
el->m_pos.y = m_height - base_line + m_font_metrics.descent - el->height() + el->content_margins_top();
break;
}
y1 = std::min(y1, el->top());
y2 = std::max(y2, el->bottom());
}
}
for (const auto& el : m_items)
{
el->m_pos.y -= y1;
el->m_pos.y += m_box_top;
if(el->get_display() != display_inline_text)
{
switch(el->get_vertical_align())
{
case va_top:
el->m_pos.y = m_box_top + el->content_margins_top();
break;
case va_bottom:
el->m_pos.y = m_box_top + (y2 - y1) - el->height() + el->content_margins_top();
break;
case va_baseline:
//TODO: process vertical align "baseline"
break;
case va_middle:
//TODO: process vertical align "middle"
break;
case va_sub:
//TODO: process vertical align "sub"
break;
case va_super:
//TODO: process vertical align "super"
break;
case va_text_bottom:
//TODO: process vertical align "text-bottom"
break;
case va_text_top:
//TODO: process vertical align "text-top"
break;
}
}
el->apply_relative_shift(m_box_right - m_box_left);
}
m_height = y2 - y1;
m_baseline = (base_line - y1) - (m_height - line_height);
}
bool litehtml::line_box::can_hold(const element::ptr &el, white_space ws) const
{
if(!el->is_inline_box()) return false;
if(el->is_break())
{
return false;
}
if(ws == white_space_nowrap || ws == white_space_pre || (ws == white_space_pre_wrap && el->is_space()))
{
return true;
}
if(m_box_left + m_width + el->width() + el->get_inline_shift_left() + el->get_inline_shift_right() > m_box_right)
{
return false;
}
return true;
}
bool litehtml::line_box::have_last_space() const
{
bool ret = false;
for (auto i = m_items.rbegin(); i != m_items.rend() && !ret; i++)
{
if((*i)->is_white_space() || (*i)->is_break())
{
ret = true;
} else
{
break;
}
}
return ret;
}
bool litehtml::line_box::is_empty() const
{
if(m_items.empty()) return true;
for (auto i = m_items.rbegin(); i != m_items.rend(); i++)
{
if(!(*i)->m_skip || (*i)->is_break())
{
return false;
}
}
return true;
}
int litehtml::line_box::baseline() const
{
return m_baseline;
}
void litehtml::line_box::get_elements( elements_vector& els )
{
els.insert(els.begin(), m_items.begin(), m_items.end());
}
int litehtml::line_box::top_margin() const
{
return 0;
}
int litehtml::line_box::bottom_margin() const
{
return 0;
}
void litehtml::line_box::y_shift( int shift )
{
m_box_top += shift;
for (auto& el : m_items)
{
el->m_pos.y += shift;
}
}
bool litehtml::line_box::is_break_only() const
{
if(m_items.empty()) return true;
if(m_items.front()->is_break())
{
for (auto& el : m_items)
{
if(!el->m_skip)
{
return false;
}
}
return true;
}
return false;
}
void litehtml::line_box::new_width( int left, int right, elements_vector& els )
{
int add = left - m_box_left;
if(add)
{
m_box_left = left;
m_box_right = right;
m_width = 0;
auto remove_begin = m_items.end();
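// Guard against an empty line box: begin() + 1 is undefined for an empty vector.
for (auto i = m_items.empty() ? m_items.end() : m_items.begin() + 1; i != m_items.end(); i++)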
{
element::ptr el = (*i);
if(!el->m_skip)
{
if(m_box_left + m_width + el->width() + el->get_inline_shift_right() + el->get_inline_shift_left() > m_box_right)
{
remove_begin = i;
break;
} else
{
el->m_pos.x += add;
m_width += el->width() + el->get_inline_shift_right() + el->get_inline_shift_left();
}
}
}
if(remove_begin != m_items.end())
{
els.insert(els.begin(), remove_begin, m_items.end());
m_items.erase(remove_begin, m_items.end());
for(const auto& el : els)
{
el->m_box = nullptr;
}
}
}
}

View File

@ -0,0 +1,82 @@
// Copyright (C) 2020-2021 Primate Labs Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the names of the copyright holders nor the names of their
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include "litehtml/codepoint.h"
#include <cstdint>
namespace {
bool lookup(const uint32_t* table, litehtml::tchar_t c)
{
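// Each table word covers 32 codepoints; use an unsigned shift so that bit 31 is well defined.
return (table[c >> 5] & (uint32_t(1) << (c & 0x1f))) != 0;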
}
} // namespace
namespace litehtml {
bool is_ascii_codepoint(litehtml::tchar_t c)
{
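// Compare as unsigned so that negative values (tchar_t may be a signed
// char) are not mistaken for ASCII and used as a negative table index.
return (static_cast<uint32_t>(c) < 128);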
}
// https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
bool is_url_reserved_codepoint(litehtml::tchar_t c)
{
static const uint32_t reserved_lookup[] = {
0x00000000,
0xac009fda,
0x28000001,
0x00000000
};
if (!is_ascii_codepoint(c)) {
return false;
}
return lookup(reserved_lookup, c);
}
// https://datatracker.ietf.org/doc/html/rfc3986#section-3.1
bool is_url_scheme_codepoint(litehtml::tchar_t c)
{
static const uint32_t scheme_lookup[] = {
0x00000000,
0x03ff6800,
0x07fffffe,
0x07fffffe,
};
if (!is_ascii_codepoint(c)) {
return false;
}
return lookup(scheme_lookup, c);
}
} // namespace litehtml
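
The magic numbers in `reserved_lookup` and `scheme_lookup` are 128-bit membership tables: bit `c & 0x1f` of word `c >> 5` is set when codepoint `c` belongs to the set. The sketch below shows how such a table can be generated from a list of characters, which makes the constants easier to audit against RFC 3986. The helpers `make_ascii_table` and `in_ascii_table` are hypothetical and not part of litehtml.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical helper: builds a membership table for a set of ASCII
// characters, one bit per codepoint, four 32-bit words in total.
std::array<std::uint32_t, 4> make_ascii_table(const std::string& members)
{
    std::array<std::uint32_t, 4> table{};
    for (unsigned char c : members) {
        if (c < 128) {
            table[c >> 5] |= (std::uint32_t{1} << (c & 0x1f));
        }
    }
    return table;
}

// Hypothetical counterpart of lookup(): tests a single bit of the table.
bool in_ascii_table(const std::array<std::uint32_t, 4>& table, unsigned char c)
{
    return c < 128 &&
           (table[c >> 5] & (std::uint32_t{1} << (c & 0x1f))) != 0;
}
```

Feeding the gen-delims and sub-delims of RFC 3986 Section 2.2 (`:/?#[]@!$&'()*+,;=`) into `make_ascii_table` reproduces the four words of `reserved_lookup`, and letters, digits, `+`, `-`, and `.` reproduce `scheme_lookup`.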

View File

@ -0,0 +1,12 @@
#include "html.h"
#include "context.h"
#include "stylesheet.h"
void litehtml::context::load_master_stylesheet( const tchar_t* str )
{
m_master_css.parse_stylesheet(str, nullptr, std::shared_ptr<litehtml::document>(), media_query_list::ptr());
m_master_css.sort_selectors();
}

View File

@ -0,0 +1,54 @@
#include "html.h"
#include "css_length.h"
void litehtml::css_length::fromString( const tstring& str, const tstring& predefs, int defValue )
{
// TODO: Add support for calc(); for now such values are treated as predefined value 0.
if(str.substr(0, 4) == _t("calc"))
{
m_is_predefined = true;
m_predef = 0;
return;
}
int predef = value_index(str, predefs, -1);
if(predef >= 0)
{
m_is_predefined = true;
m_predef = predef;
} else
{
m_is_predefined = false;
tstring num;
tstring un;
bool is_unit = false;
for(tchar_t chr : str)
{
if(!is_unit)
{
if(t_isdigit(chr) || chr == _t('.') || chr == _t('+') || chr == _t('-'))
{
num += chr;
} else
{
is_unit = true;
}
}
if(is_unit)
{
un += chr;
}
}
if(!num.empty())
{
m_value = (float) t_strtod(num.c_str(), nullptr);
m_units = (css_units) value_index(un, css_units_strings, css_units_none);
} else
{
// not a number so it is predefined
m_is_predefined = true;
m_predef = defValue;
}
}
}
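
The loop in `css_length::fromString()` splits a value such as `1.17em` into a numeric prefix and a unit suffix. The standalone sketch below restates that split without litehtml's types, so the behavior is easy to try in isolation. `parsed_length` and `split_length` are hypothetical names; the lookup of predefined keywords and the mapping of the unit string to `css_units` are omitted.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical result type; litehtml::css_length stores m_value and m_units.
struct parsed_length {
    bool has_number = false;
    float value = 0.0f;
    std::string unit;
};

// Splits "40px" into 40 and "px". Leading digits, '.', '+' and '-' form the
// number; the remainder of the string is taken as the unit.
parsed_length split_length(const std::string& str)
{
    std::size_t i = 0;
    while (i < str.size()) {
        const unsigned char ch = static_cast<unsigned char>(str[i]);
        if (!std::isdigit(ch) && ch != '.' && ch != '+' && ch != '-') break;
        ++i;
    }

    parsed_length result;
    if (i > 0) {
        result.has_number = true;
        result.value = static_cast<float>(
            std::strtod(str.substr(0, i).c_str(), nullptr));
        result.unit = str.substr(i);
    }
    return result;
}
```

A value without a numeric prefix, such as `auto`, leaves `has_number` false, which corresponds to the branch where `fromString()` falls back to `defValue`.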

View File

@ -0,0 +1,263 @@
#include "html.h"
#include "css_selector.h"
#include "document.h"
void litehtml::css_element_selector::parse( const tstring& txt )
{
tstring::size_type el_end = txt.find_first_of(_t(".#[:"));
m_tag = txt.substr(0, el_end);
litehtml::lcase(m_tag);
m_attrs.clear();
while(el_end != tstring::npos)
{
if(txt[el_end] == _t('.'))
{
css_attribute_selector attribute;
tstring::size_type pos = txt.find_first_of(_t(".#[:"), el_end + 1);
attribute.val = txt.substr(el_end + 1, pos - el_end - 1);
split_string( attribute.val, attribute.class_val, _t(" ") );
attribute.condition = select_equal;
attribute.attribute = _t("class");
m_attrs.push_back(attribute);
el_end = pos;
} else if(txt[el_end] == _t(':'))
{
css_attribute_selector attribute;
if(txt[el_end + 1] == _t(':'))
{
tstring::size_type pos = txt.find_first_of(_t(".#[:"), el_end + 2);
attribute.val = txt.substr(el_end + 2, pos - el_end - 2);
attribute.condition = select_pseudo_element;
litehtml::lcase(attribute.val);
attribute.attribute = _t("pseudo-el");
m_attrs.push_back(attribute);
el_end = pos;
} else
{
tstring::size_type pos = txt.find_first_of(_t(".#[:("), el_end + 1);
if(pos != tstring::npos && txt.at(pos) == _t('('))
{
pos = find_close_bracket(txt, pos);
if(pos != tstring::npos)
{
pos++;
}
}
if(pos != tstring::npos)
{
attribute.val = txt.substr(el_end + 1, pos - el_end - 1);
} else
{
attribute.val = txt.substr(el_end + 1);
}
litehtml::lcase(attribute.val);
if(attribute.val == _t("after") || attribute.val == _t("before"))
{
attribute.condition = select_pseudo_element;
} else
{
attribute.condition = select_pseudo_class;
}
attribute.attribute = _t("pseudo");
m_attrs.push_back(attribute);
el_end = pos;
}
} else if(txt[el_end] == _t('#'))
{
css_attribute_selector attribute;
tstring::size_type pos = txt.find_first_of(_t(".#[:"), el_end + 1);
attribute.val = txt.substr(el_end + 1, pos - el_end - 1);
attribute.condition = select_equal;
attribute.attribute = _t("id");
m_attrs.push_back(attribute);
el_end = pos;
} else if(txt[el_end] == _t('['))
{
css_attribute_selector attribute;
tstring::size_type pos = txt.find_first_of(_t("]~=|$*^"), el_end + 1);
tstring attr = txt.substr(el_end + 1, pos - el_end - 1);
trim(attr);
litehtml::lcase(attr);
if(pos != tstring::npos)
{
if(txt[pos] == _t(']'))
{
attribute.condition = select_exists;
} else if(txt[pos] == _t('='))
{
attribute.condition = select_equal;
pos++;
} else if(txt.substr(pos, 2) == _t("~="))
{
attribute.condition = select_contain_str;
pos += 2;
} else if(txt.substr(pos, 2) == _t("|="))
{
attribute.condition = select_start_str;
pos += 2;
} else if(txt.substr(pos, 2) == _t("^="))
{
attribute.condition = select_start_str;
pos += 2;
} else if(txt.substr(pos, 2) == _t("$="))
{
attribute.condition = select_end_str;
pos += 2;
} else if(txt.substr(pos, 2) == _t("*="))
{
attribute.condition = select_contain_str;
pos += 2;
} else
{
attribute.condition = select_exists;
pos += 1;
}
pos = txt.find_first_not_of(_t(" \t"), pos);
if(pos != tstring::npos)
{
if(txt[pos] == _t('"'))
{
tstring::size_type pos2 = txt.find_first_of(_t('\"'), pos + 1);
attribute.val = txt.substr(pos + 1, pos2 == tstring::npos ? pos2 : (pos2 - pos - 1));
pos = pos2 == tstring::npos ? pos2 : (pos2 + 1);
} else if(txt[pos] == _t(']'))
{
pos ++;
} else
{
tstring::size_type pos2 = txt.find_first_of(_t(']'), pos + 1);
attribute.val = txt.substr(pos, pos2 == tstring::npos ? pos2 : (pos2 - pos));
trim(attribute.val);
pos = pos2 == tstring::npos ? pos2 : (pos2 + 1);
}
}
} else
{
attribute.condition = select_exists;
}
attribute.attribute = attr;
m_attrs.push_back(attribute);
el_end = pos;
} else
{
el_end++;
}
el_end = txt.find_first_of(_t(".#[:"), el_end);
}
}
bool litehtml::css_selector::parse( const tstring& text )
{
if(text.empty())
{
return false;
}
string_vector tokens;
split_string(text, tokens, _t(""), _t(" \t>+~"), _t("(["));
if(tokens.empty())
{
return false;
}
tstring left;
tstring right = tokens.back();
tchar_t combinator = 0;
tokens.pop_back();
while(!tokens.empty() && (tokens.back() == _t(" ") || tokens.back() == _t("\t") || tokens.back() == _t("+") || tokens.back() == _t("~") || tokens.back() == _t(">")))
{
if(combinator == _t(' ') || combinator == 0)
{
combinator = tokens.back()[0];
}
tokens.pop_back();
}
for(const auto & token : tokens)
{
left += token;
}
trim(left);
trim(right);
if(right.empty())
{
return false;
}
m_right.parse(right);
switch(combinator)
{
case _t('>'):
m_combinator = combinator_child;
break;
case _t('+'):
m_combinator = combinator_adjacent_sibling;
break;
case _t('~'):
m_combinator = combinator_general_sibling;
break;
default:
m_combinator = combinator_descendant;
break;
}
m_left = nullptr;
if(!left.empty())
{
m_left = std::make_shared<css_selector>(media_query_list::ptr(nullptr), _t(""));
if(!m_left->parse(left))
{
return false;
}
}
return true;
}
void litehtml::css_selector::calc_specificity()
{
if(!m_right.m_tag.empty() && m_right.m_tag != _t("*"))
{
m_specificity.d = 1;
}
for(const auto& attr : m_right.m_attrs)
{
if(attr.attribute == _t("id"))
{
m_specificity.b++;
} else
{
if(attr.attribute == _t("class"))
{
m_specificity.c += (int) attr.class_val.size();
} else
{
m_specificity.c++;
}
}
}
if(m_left)
{
m_left->calc_specificity();
m_specificity += m_left->m_specificity;
}
}
void litehtml::css_selector::add_media_to_doc( document* doc ) const
{
if(m_media_query && doc)
{
doc->add_media_list(m_media_query);
}
}

View File

@ -0,0 +1,974 @@
#include "html.h"
#include "document.h"
#include "stylesheet.h"
#include "html_tag.h"
#include "el_text.h"
#include "el_para.h"
#include "el_space.h"
#include "el_body.h"
#include "el_image.h"
#include "el_table.h"
#include "el_td.h"
#include "el_link.h"
#include "el_title.h"
#include "el_style.h"
#include "el_script.h"
#include "el_comment.h"
#include "el_cdata.h"
#include "el_base.h"
#include "el_anchor.h"
#include "el_break.h"
#include "el_div.h"
#include "el_font.h"
#include "el_tr.h"
#include "el_li.h"
#include <cmath>
#include <cstdio>
#include <algorithm>
#include <functional>
#include "gumbo.h"
#include "utf8_strings.h"
litehtml::document::document(litehtml::document_container* objContainer, litehtml::context* ctx)
{
m_container = objContainer;
m_context = ctx;
}
litehtml::document::~document()
{
m_over_element = nullptr;
if(m_container)
{
for(auto & m_font : m_fonts)
{
m_container->delete_font(m_font.second.font);
}
}
}
litehtml::document::ptr litehtml::document::createFromString( const tchar_t* str, litehtml::document_container* objPainter, litehtml::context* ctx, litehtml::css* user_styles)
{
return createFromUTF8(litehtml_to_utf8(str), objPainter, ctx, user_styles);
}
litehtml::document::ptr litehtml::document::createFromUTF8(const char* str, litehtml::document_container* objPainter, litehtml::context* ctx, litehtml::css* user_styles)
{
// parse document into GumboOutput
GumboOutput* output = gumbo_parse((const char*) str);
// Create litehtml::document
litehtml::document::ptr doc = std::make_shared<litehtml::document>(objPainter, ctx);
// Create litehtml::elements.
elements_vector root_elements;
doc->create_node(output->root, root_elements, true);
if (!root_elements.empty())
{
doc->m_root = root_elements.back();
}
// Destroy GumboOutput
gumbo_destroy_output(&kGumboDefaultOptions, output);
// Let's process created elements tree
if (doc->m_root)
{
doc->container()->get_media_features(doc->m_media);
doc->m_root->set_pseudo_class(_t("root"), true);
// apply master CSS
doc->m_root->apply_stylesheet(ctx->master_css());
// parse elements attributes
doc->m_root->parse_attributes();
// parse style sheets linked in document
media_query_list::ptr media;
for (const auto& css : doc->m_css)
{
if (!css.media.empty())
{
media = media_query_list::create_from_string(css.media, doc);
}
else
{
media = nullptr;
}
doc->m_styles.parse_stylesheet(css.text.c_str(), css.baseurl.c_str(), doc, media);
}
// Sort css selectors using CSS rules.
doc->m_styles.sort_selectors();
// get current media features
if (!doc->m_media_lists.empty())
{
doc->update_media_lists(doc->m_media);
}
// Apply parsed styles.
doc->m_root->apply_stylesheet(doc->m_styles);
// Apply user styles if any
if (user_styles)
{
doc->m_root->apply_stylesheet(*user_styles);
}
// Parse applied styles in the elements
doc->m_root->parse_styles();
// m_tabular_elements is now filled with the tabular elements.
// Check them for missing table elements and create the anonymous
// boxes required by the visual table layout.
doc->fix_tables_layout();
// Finally, initialize the elements
doc->m_root->init();
}
return doc;
}
litehtml::uint_ptr litehtml::document::add_font( const tchar_t* name, int size, const tchar_t* weight, const tchar_t* style, const tchar_t* decoration, font_metrics* fm )
{
uint_ptr ret = 0;
if(!name || !t_strcasecmp(name, _t("inherit")))
{
name = m_container->get_default_font_name();
}
if(!size)
{
size = container()->get_default_font_size();
}
tchar_t strSize[20];
t_itoa(size, strSize, 20, 10);
tstring key = name;
key += _t(":");
key += strSize;
key += _t(":");
key += weight;
key += _t(":");
key += style;
key += _t(":");
key += decoration;
if(m_fonts.find(key) == m_fonts.end())
{
font_style fs = (font_style) value_index(style, font_style_strings, fontStyleNormal);
int fw = value_index(weight, font_weight_strings, -1);
if(fw >= 0)
{
switch(fw)
{
case litehtml::fontWeightBold:
fw = 700;
break;
case litehtml::fontWeightBolder:
fw = 600;
break;
case litehtml::fontWeightLighter:
fw = 300;
break;
default:
fw = 400;
break;
}
} else
{
fw = t_atoi(weight);
if(fw < 100)
{
fw = 400;
}
}
unsigned int decor = 0;
if(decoration)
{
std::vector<tstring> tokens;
split_string(decoration, tokens, _t(" "));
for(auto & token : tokens)
{
if(!t_strcasecmp(token.c_str(), _t("underline")))
{
decor |= font_decoration_underline;
} else if(!t_strcasecmp(token.c_str(), _t("line-through")))
{
decor |= font_decoration_linethrough;
} else if(!t_strcasecmp(token.c_str(), _t("overline")))
{
decor |= font_decoration_overline;
}
}
}
font_item fi= {0};
fi.font = m_container->create_font(name, size, fw, fs, decor, &fi.metrics);
m_fonts[key] = fi;
ret = fi.font;
if(fm)
{
*fm = fi.metrics;
}
}
return ret;
}
litehtml::uint_ptr litehtml::document::get_font( const tchar_t* name, int size, const tchar_t* weight, const tchar_t* style, const tchar_t* decoration, font_metrics* fm )
{
if(!name || !t_strcasecmp(name, _t("inherit")))
{
name = m_container->get_default_font_name();
}
if(!size)
{
size = m_container->get_default_font_size();
}
tchar_t strSize[20];
t_itoa(size, strSize, 20, 10);
tstring key = name;
key += _t(":");
key += strSize;
key += _t(":");
key += weight;
key += _t(":");
key += style;
key += _t(":");
key += decoration;
auto el = m_fonts.find(key);
if(el != m_fonts.end())
{
if(fm)
{
*fm = el->second.metrics;
}
return el->second.font;
}
return add_font(name, size, weight, style, decoration, fm);
}
int litehtml::document::render( int max_width, render_type rt )
{
int ret = 0;
if(m_root)
{
if(rt == render_fixed_only)
{
m_fixed_boxes.clear();
m_root->render_positioned(rt);
} else
{
ret = m_root->render(0, 0, max_width);
if(m_root->fetch_positioned())
{
m_fixed_boxes.clear();
m_root->render_positioned(rt);
}
m_size.width = 0;
m_size.height = 0;
m_root->calc_document_size(m_size);
}
}
return ret;
}
void litehtml::document::draw( uint_ptr hdc, int x, int y, const position* clip )
{
if(m_root)
{
m_root->draw(hdc, x, y, clip);
m_root->draw_stacking_context(hdc, x, y, clip, true);
}
}
int litehtml::document::cvt_units( const tchar_t* str, int fontSize, bool* is_percent/*= 0*/ ) const
{
if(!str) return 0;
css_length val;
val.fromString(str);
if(is_percent && val.units() == css_units_percentage && !val.is_predefined())
{
*is_percent = true;
}
return cvt_units(val, fontSize);
}
int litehtml::document::cvt_units( css_length& val, int fontSize, int size ) const
{
if(val.is_predefined())
{
return 0;
}
int ret;
switch(val.units())
{
case css_units_percentage:
ret = val.calc_percent(size);
break;
case css_units_em:
ret = round_f(val.val() * (float) fontSize);
val.set_value((float) ret, css_units_px);
break;
case css_units_pt:
ret = m_container->pt_to_px((int) val.val());
val.set_value((float) ret, css_units_px);
break;
case css_units_in:
ret = m_container->pt_to_px((int) (val.val() * 72));
val.set_value((float) ret, css_units_px);
break;
case css_units_cm:
ret = m_container->pt_to_px((int) (val.val() * 0.3937 * 72));
val.set_value((float) ret, css_units_px);
break;
case css_units_mm:
ret = m_container->pt_to_px((int) (val.val() * 0.3937 * 72) / 10);
val.set_value((float) ret, css_units_px);
break;
case css_units_vw:
ret = (int)((double)m_media.width * (double)val.val() / 100.0);
break;
case css_units_vh:
ret = (int)((double)m_media.height * (double)val.val() / 100.0);
break;
case css_units_vmin:
ret = (int)((double)std::min(m_media.height, m_media.width) * (double)val.val() / 100.0);
break;
case css_units_vmax:
ret = (int)((double)std::max(m_media.height, m_media.width) * (double)val.val() / 100.0);
break;
case css_units_rem:
ret = (int) ((double) m_root->get_font_size() * (double) val.val());
val.set_value((float) ret, css_units_px);
break;
default:
ret = (int) val.val();
break;
}
return ret;
}
int litehtml::document::width() const
{
return m_size.width;
}
int litehtml::document::height() const
{
return m_size.height;
}
void litehtml::document::add_stylesheet( const tchar_t* str, const tchar_t* baseurl, const tchar_t* media )
{
if(str && str[0])
{
m_css.push_back(css_text(str, baseurl, media));
}
}
bool litehtml::document::on_mouse_over( int x, int y, int client_x, int client_y, position::vector& redraw_boxes )
{
if(!m_root)
{
return false;
}
element::ptr over_el = m_root->get_element_by_point(x, y, client_x, client_y);
bool state_was_changed = false;
if(over_el != m_over_element)
{
if(m_over_element)
{
if(m_over_element->on_mouse_leave())
{
state_was_changed = true;
}
}
m_over_element = over_el;
}
const tchar_t* cursor = nullptr;
if(m_over_element)
{
if(m_over_element->on_mouse_over())
{
state_was_changed = true;
}
cursor = m_over_element->get_cursor();
}
m_container->set_cursor(cursor ? cursor : _t("auto"));
if(state_was_changed)
{
return m_root->find_styles_changes(redraw_boxes, 0, 0);
}
return false;
}
bool litehtml::document::on_mouse_leave( position::vector& redraw_boxes )
{
if(!m_root)
{
return false;
}
if(m_over_element)
{
if(m_over_element->on_mouse_leave())
{
return m_root->find_styles_changes(redraw_boxes, 0, 0);
}
}
return false;
}
bool litehtml::document::on_lbutton_down( int x, int y, int client_x, int client_y, position::vector& redraw_boxes )
{
if(!m_root)
{
return false;
}
element::ptr over_el = m_root->get_element_by_point(x, y, client_x, client_y);
bool state_was_changed = false;
if(over_el != m_over_element)
{
if(m_over_element)
{
if(m_over_element->on_mouse_leave())
{
state_was_changed = true;
}
}
m_over_element = over_el;
if(m_over_element)
{
if(m_over_element->on_mouse_over())
{
state_was_changed = true;
}
}
}
const tchar_t* cursor = nullptr;
if(m_over_element)
{
if(m_over_element->on_lbutton_down())
{
state_was_changed = true;
}
cursor = m_over_element->get_cursor();
}
m_container->set_cursor(cursor ? cursor : _t("auto"));
if(state_was_changed)
{
return m_root->find_styles_changes(redraw_boxes, 0, 0);
}
return false;
}
bool litehtml::document::on_lbutton_up( int x, int y, int client_x, int client_y, position::vector& redraw_boxes )
{
if(!m_root)
{
return false;
}
if(m_over_element)
{
if(m_over_element->on_lbutton_up())
{
return m_root->find_styles_changes(redraw_boxes, 0, 0);
}
}
return false;
}
litehtml::element::ptr litehtml::document::create_element(const tchar_t* tag_name, const string_map& attributes)
{
element::ptr newTag;
document::ptr this_doc = shared_from_this();
if(m_container)
{
newTag = m_container->create_element(tag_name, attributes, this_doc);
}
if(!newTag)
{
if(!t_strcmp(tag_name, _t("br")))
{
newTag = std::make_shared<litehtml::el_break>(this_doc);
} else if(!t_strcmp(tag_name, _t("p")))
{
newTag = std::make_shared<litehtml::el_para>(this_doc);
} else if(!t_strcmp(tag_name, _t("img")))
{
newTag = std::make_shared<litehtml::el_image>(this_doc);
} else if(!t_strcmp(tag_name, _t("table")))
{
newTag = std::make_shared<litehtml::el_table>(this_doc);
} else if(!t_strcmp(tag_name, _t("td")) || !t_strcmp(tag_name, _t("th")))
{
newTag = std::make_shared<litehtml::el_td>(this_doc);
} else if(!t_strcmp(tag_name, _t("link")))
{
newTag = std::make_shared<litehtml::el_link>(this_doc);
} else if(!t_strcmp(tag_name, _t("title")))
{
newTag = std::make_shared<litehtml::el_title>(this_doc);
} else if(!t_strcmp(tag_name, _t("a")))
{
newTag = std::make_shared<litehtml::el_anchor>(this_doc);
} else if(!t_strcmp(tag_name, _t("tr")))
{
newTag = std::make_shared<litehtml::el_tr>(this_doc);
} else if(!t_strcmp(tag_name, _t("style")))
{
newTag = std::make_shared<litehtml::el_style>(this_doc);
} else if(!t_strcmp(tag_name, _t("base")))
{
newTag = std::make_shared<litehtml::el_base>(this_doc);
} else if(!t_strcmp(tag_name, _t("body")))
{
newTag = std::make_shared<litehtml::el_body>(this_doc);
} else if(!t_strcmp(tag_name, _t("div")))
{
newTag = std::make_shared<litehtml::el_div>(this_doc);
} else if(!t_strcmp(tag_name, _t("script")))
{
newTag = std::make_shared<litehtml::el_script>(this_doc);
} else if(!t_strcmp(tag_name, _t("font")))
{
newTag = std::make_shared<litehtml::el_font>(this_doc);
} else if(!t_strcmp(tag_name, _t("li")))
{
newTag = std::make_shared<litehtml::el_li>(this_doc);
} else
{
newTag = std::make_shared<litehtml::html_tag>(this_doc);
}
}
if(newTag)
{
newTag->set_tagName(tag_name);
for (const auto & attribute : attributes)
{
newTag->set_attr(attribute.first.c_str(), attribute.second.c_str());
}
}
return newTag;
}
void litehtml::document::get_fixed_boxes( position::vector& fixed_boxes )
{
fixed_boxes = m_fixed_boxes;
}
void litehtml::document::add_fixed_box( const position& pos )
{
m_fixed_boxes.push_back(pos);
}
bool litehtml::document::media_changed()
{
container()->get_media_features(m_media);
if (update_media_lists(m_media))
{
m_root->refresh_styles();
m_root->parse_styles();
return true;
}
return false;
}
bool litehtml::document::lang_changed()
{
if(!m_media_lists.empty())
{
tstring culture;
container()->get_language(m_lang, culture);
if(!culture.empty())
{
m_culture = m_lang + _t('-') + culture;
}
else
{
m_culture.clear();
}
m_root->refresh_styles();
m_root->parse_styles();
return true;
}
return false;
}
bool litehtml::document::update_media_lists(const media_features& features)
{
bool update_styles = false;
for(auto & m_media_list : m_media_lists)
{
if(m_media_list->apply_media_features(features))
{
update_styles = true;
}
}
return update_styles;
}
void litehtml::document::add_media_list( const media_query_list::ptr& list )
{
if(list)
{
if(std::find(m_media_lists.begin(), m_media_lists.end(), list) == m_media_lists.end())
{
m_media_lists.push_back(list);
}
}
}
void litehtml::document::create_node(void* gnode, elements_vector& elements, bool parseTextNode)
{
auto* node = (GumboNode*)gnode;
switch (node->type)
{
case GUMBO_NODE_ELEMENT:
{
string_map attrs;
GumboAttribute* attr;
for (unsigned int i = 0; i < node->v.element.attributes.length; i++)
{
attr = (GumboAttribute*)node->v.element.attributes.data[i];
attrs[tstring(litehtml_from_utf8(attr->name))] = litehtml_from_utf8(attr->value);
}
element::ptr ret;
const char* tag = gumbo_normalized_tagname(node->v.element.tag);
if (tag[0])
{
ret = create_element(litehtml_from_utf8(tag), attrs);
}
else
{
if (node->v.element.original_tag.data && node->v.element.original_tag.length)
{
std::string strA;
gumbo_tag_from_original_text(&node->v.element.original_tag);
strA.append(node->v.element.original_tag.data, node->v.element.original_tag.length);
ret = create_element(litehtml_from_utf8(strA.c_str()), attrs);
}
}
if (!strcmp(tag, "script"))
{
parseTextNode = false;
}
if (ret)
{
elements_vector child;
for (unsigned int i = 0; i < node->v.element.children.length; i++)
{
child.clear();
create_node(static_cast<GumboNode*> (node->v.element.children.data[i]), child, parseTextNode);
std::for_each(child.begin(), child.end(),
[&ret](element::ptr& el)
{
ret->appendChild(el);
}
);
}
elements.push_back(ret);
}
}
break;
case GUMBO_NODE_TEXT:
{
std::wstring str_in = (const wchar_t*) (utf8_to_wchar(node->v.text.text));
if (!parseTextNode)
{
elements.push_back(std::make_shared<el_text>(litehtml_from_wchar(str_in).c_str(), shared_from_this()));
}
else
{
m_container->split_text(node->v.text.text,
[this, &elements](const tchar_t* text) { elements.push_back(std::make_shared<el_text>(text, shared_from_this())); },
[this, &elements](const tchar_t* text) { elements.push_back(std::make_shared<el_space>(text, shared_from_this())); });
}
}
break;
case GUMBO_NODE_CDATA:
{
element::ptr ret = std::make_shared<el_cdata>(shared_from_this());
ret->set_data(litehtml_from_utf8(node->v.text.text));
elements.push_back(ret);
}
break;
case GUMBO_NODE_COMMENT:
{
element::ptr ret = std::make_shared<el_comment>(shared_from_this());
ret->set_data(litehtml_from_utf8(node->v.text.text));
elements.push_back(ret);
}
break;
case GUMBO_NODE_WHITESPACE:
{
tstring str = litehtml_from_utf8(node->v.text.text);
for (size_t i = 0; i < str.length(); i++)
{
elements.push_back(std::make_shared<el_space>(str.substr(i, 1).c_str(), shared_from_this()));
}
}
break;
default:
break;
}
}
void litehtml::document::fix_tables_layout()
{
size_t i = 0;
while (i < m_tabular_elements.size())
{
element::ptr el_ptr = m_tabular_elements[i];
switch (el_ptr->get_display())
{
case display_inline_table:
case display_table:
fix_table_children(el_ptr, display_table_row_group, _t("table-row-group"));
break;
case display_table_footer_group:
case display_table_row_group:
case display_table_header_group:
{
element::ptr parent = el_ptr->parent();
if (parent)
{
if (parent->get_display() != display_inline_table)
fix_table_parent(el_ptr, display_table, _t("table"));
}
fix_table_children(el_ptr, display_table_row, _t("table-row"));
}
break;
case display_table_row:
fix_table_parent(el_ptr, display_table_row_group, _t("table-row-group"));
fix_table_children(el_ptr, display_table_cell, _t("table-cell"));
break;
case display_table_cell:
fix_table_parent(el_ptr, display_table_row, _t("table-row"));
break;
// TODO: make table layout fix for table-caption, table-column etc. elements
case display_table_caption:
case display_table_column:
case display_table_column_group:
default:
break;
}
i++;
}
}
void litehtml::document::fix_table_children(element::ptr& el_ptr, style_display disp, const tchar_t* disp_str)
{
elements_vector tmp;
auto first_iter = el_ptr->m_children.begin();
auto cur_iter = el_ptr->m_children.begin();
auto flush_elements = [&]()
{
element::ptr annon_tag = std::make_shared<html_tag>(shared_from_this());
annon_tag->add_style(tstring(_t("display:")) + disp_str, _t(""));
annon_tag->parent(el_ptr);
annon_tag->parse_styles();
std::for_each(tmp.begin(), tmp.end(),
[&annon_tag](element::ptr& el)
{
annon_tag->appendChild(el);
}
);
first_iter = el_ptr->m_children.insert(first_iter, annon_tag);
cur_iter = first_iter + 1;
while (cur_iter != el_ptr->m_children.end() && (*cur_iter)->parent() != el_ptr)
{
cur_iter = el_ptr->m_children.erase(cur_iter);
}
first_iter = cur_iter;
tmp.clear();
};
while (cur_iter != el_ptr->m_children.end())
{
if ((*cur_iter)->get_display() != disp)
{
if (!(*cur_iter)->is_table_skip() || ((*cur_iter)->is_table_skip() && !tmp.empty()))
{
if (disp != display_table_row_group || (*cur_iter)->get_display() != display_table_caption)
{
if (tmp.empty())
{
first_iter = cur_iter;
}
tmp.push_back((*cur_iter));
}
}
cur_iter++;
}
else if (!tmp.empty())
{
flush_elements();
}
else
{
cur_iter++;
}
}
if (!tmp.empty())
{
flush_elements();
}
}
void litehtml::document::fix_table_parent(element::ptr& el_ptr, style_display disp, const tchar_t* disp_str)
{
element::ptr parent = el_ptr->parent();
if (parent && parent->get_display() != disp)
{
auto this_element = std::find_if(parent->m_children.begin(), parent->m_children.end(),
[&](element::ptr& el)
{
if (el == el_ptr)
{
return true;
}
return false;
}
);
if (this_element != parent->m_children.end())
{
style_display el_disp = el_ptr->get_display();
auto first = this_element;
auto last = this_element;
auto cur = this_element;
// find first element with same display
while (true)
{
if (cur == parent->m_children.begin()) break;
cur--;
if ((*cur)->is_table_skip() || (*cur)->get_display() == el_disp)
{
first = cur;
}
else
{
break;
}
}
// find last element with same display
cur = this_element;
while (true)
{
cur++;
if (cur == parent->m_children.end()) break;
if ((*cur)->is_table_skip() || (*cur)->get_display() == el_disp)
{
last = cur;
}
else
{
break;
}
}
// extract elements with the same display and wrap them with anonymous object
element::ptr annon_tag = std::make_shared<html_tag>(shared_from_this());
annon_tag->add_style(tstring(_t("display:")) + disp_str, _t(""));
annon_tag->parent(parent);
annon_tag->parse_styles();
std::for_each(first, last + 1,
[&annon_tag](element::ptr& el)
{
annon_tag->appendChild(el);
}
);
first = parent->m_children.erase(first, last + 1);
parent->m_children.insert(first, annon_tag);
}
}
}
void litehtml::document::append_children_from_string(element& parent, const tchar_t* str)
{
append_children_from_utf8(parent, litehtml_to_utf8(str));
}
void litehtml::document::append_children_from_utf8(element& parent, const char* str)
{
// parent must belong to this document
if (parent.get_document().get() != this)
{
return;
}
// parse document into GumboOutput
GumboOutput* output = gumbo_parse((const char*) str);
// Create litehtml::elements.
elements_vector child_elements;
create_node(output->root, child_elements, true);
// Destroy GumboOutput
gumbo_destroy_output(&kGumboDefaultOptions, output);
// Let's process created elements tree
for (const auto& child : child_elements)
{
// Add the child element to parent
parent.appendChild(child);
// apply master CSS
child->apply_stylesheet(m_context->master_css());
// parse elements attributes
child->parse_attributes();
// Apply parsed styles.
child->apply_stylesheet(m_styles);
// Parse applied styles in the elements
child->parse_styles();
// m_tabular_elements is now filled with the tabular elements.
// Check them for missing table elements and create the anonymous
// boxes required by the visual table layout.
fix_tables_layout();
// Finally, initialize the elements
child->init();
}
}

View File

@ -0,0 +1,26 @@
#include "html.h"
#include "el_anchor.h"
#include "document.h"
litehtml::el_anchor::el_anchor(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
void litehtml::el_anchor::on_click()
{
const tchar_t* href = get_attr(_t("href"));
if(href)
{
get_document()->container()->on_anchor_click(href, shared_from_this());
}
}
void litehtml::el_anchor::apply_stylesheet( const litehtml::css& stylesheet )
{
if( get_attr(_t("href")) )
{
m_pseudo_classes.push_back(_t("link"));
}
html_tag::apply_stylesheet(stylesheet);
}

View File

@ -0,0 +1,13 @@
#include "html.h"
#include "el_base.h"
#include "document.h"
litehtml::el_base::el_base(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
void litehtml::el_base::parse_attributes()
{
get_document()->container()->set_base_url(get_attr(_t("href")));
}

View File

@ -0,0 +1,207 @@
#include "html.h"
#include "el_before_after.h"
#include "el_text.h"
#include "el_space.h"
#include "el_image.h"
#include "utf8_strings.h"
litehtml::el_before_after_base::el_before_after_base(const std::shared_ptr<litehtml::document>& doc, bool before) : html_tag(doc)
{
if(before)
{
m_tag = _t("::before");
} else
{
m_tag = _t("::after");
}
}
void litehtml::el_before_after_base::add_style(const tstring& style, const tstring& baseurl)
{
html_tag::add_style(style, baseurl);
auto children = m_children;
m_children.clear();
tstring content = get_style_property(_t("content"), false, _t(""));
if(!content.empty())
{
int idx = value_index(content, content_property_string);
if(idx < 0)
{
tstring fnc;
tstring::size_type i = 0;
while(i < content.length() && i != tstring::npos)
{
if(content.at(i) == _t('"'))
{
fnc.clear();
i++;
tstring::size_type pos = content.find(_t('"'), i);
tstring txt;
if(pos == tstring::npos)
{
txt = content.substr(i);
i = tstring::npos;
} else
{
txt = content.substr(i, pos - i);
i = pos + 1;
}
add_text(txt);
} else if(content.at(i) == _t('('))
{
i++;
litehtml::trim(fnc);
litehtml::lcase(fnc);
tstring::size_type pos = content.find(_t(')'), i);
tstring params;
if(pos == tstring::npos)
{
params = content.substr(i);
i = tstring::npos;
} else
{
params = content.substr(i, pos - i);
i = pos + 1;
}
add_function(fnc, params);
fnc.clear();
} else
{
fnc += content.at(i);
i++;
}
}
}
}
if(m_children.empty())
{
m_children = children;
}
}
void litehtml::el_before_after_base::add_text( const tstring& txt )
{
tstring word;
tstring esc;
for(tstring::size_type i = 0; i < txt.length(); i++)
{
if( (txt.at(i) == _t(' ')) || (txt.at(i) == _t('\t')) || (txt.at(i) == _t('\\') && !esc.empty()) )
{
if(esc.empty())
{
if(!word.empty())
{
element::ptr el = std::make_shared<el_text>(word.c_str(), get_document());
appendChild(el);
word.clear();
}
element::ptr el = std::make_shared<el_space>(txt.substr(i, 1).c_str(), get_document());
appendChild(el);
} else
{
word += convert_escape(esc.c_str() + 1);
esc.clear();
if(txt.at(i) == _t('\\'))
{
esc += txt.at(i);
}
}
} else
{
if(!esc.empty() || txt.at(i) == _t('\\'))
{
esc += txt.at(i);
} else
{
word += txt.at(i);
}
}
}
if(!esc.empty())
{
word += convert_escape(esc.c_str() + 1);
}
if(!word.empty())
{
element::ptr el = std::make_shared<el_text>(word.c_str(), get_document());
appendChild(el);
word.clear();
}
}
void litehtml::el_before_after_base::add_function( const tstring& fnc, const tstring& params )
{
int idx = value_index(fnc, _t("attr;counter;url"));
switch(idx)
{
// attr
case 0:
{
tstring p_name = params;
trim(p_name);
lcase(p_name);
element::ptr el_parent = parent();
if (el_parent)
{
const tchar_t* attr_value = el_parent->get_attr(p_name.c_str());
if (attr_value)
{
add_text(attr_value);
}
}
}
break;
// counter
case 1:
break;
// url
case 2:
{
tstring p_url = params;
trim(p_url);
if(!p_url.empty())
{
if(p_url.at(0) == _t('\'') || p_url.at(0) == _t('\"'))
{
p_url.erase(0, 1);
}
}
if(!p_url.empty())
{
if(p_url.at(p_url.length() - 1) == _t('\'') || p_url.at(p_url.length() - 1) == _t('\"'))
{
p_url.erase(p_url.length() - 1, 1);
}
}
if(!p_url.empty())
{
element::ptr el = std::make_shared<el_image>(get_document());
el->set_attr(_t("src"), p_url.c_str());
el->set_attr(_t("style"), _t("display:inline-block"));
el->set_tagName(_t("img"));
appendChild(el);
el->parse_attributes();
}
}
break;
}
}
litehtml::tstring litehtml::el_before_after_base::convert_escape( const tchar_t* txt )
{
tchar_t* str_end;
wchar_t u_str[2];
u_str[0] = (wchar_t) t_strtol(txt, &str_end, 16);
u_str[1] = 0;
return litehtml::tstring(litehtml_from_wchar(u_str));
}
void litehtml::el_before_after_base::apply_stylesheet( const litehtml::css& stylesheet )
{
}

View File

@ -0,0 +1,12 @@
#include "html.h"
#include "el_body.h"
#include "document.h"
litehtml::el_body::el_body(const std::shared_ptr<litehtml::document>& doc) : litehtml::html_tag(doc)
{
}
bool litehtml::el_body::is_body() const
{
return true;
}

View File

@ -0,0 +1,13 @@
#include "html.h"
#include "el_break.h"
litehtml::el_break::el_break(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
bool litehtml::el_break::is_break() const
{
return true;
}

View File

@ -0,0 +1,20 @@
#include "html.h"
#include "el_cdata.h"
litehtml::el_cdata::el_cdata(const std::shared_ptr<litehtml::document>& doc) : litehtml::element(doc)
{
m_skip = true;
}
void litehtml::el_cdata::get_text( tstring& text )
{
text += m_text;
}
void litehtml::el_cdata::set_data( const tchar_t* data )
{
if(data)
{
m_text += data;
}
}

View File

@ -0,0 +1,25 @@
#include "html.h"
#include "el_comment.h"
litehtml::el_comment::el_comment(const std::shared_ptr<litehtml::document>& doc) : litehtml::element(doc)
{
m_skip = true;
}
bool litehtml::el_comment::is_comment() const
{
return true;
}
void litehtml::el_comment::get_text( tstring& text )
{
text += m_text;
}
void litehtml::el_comment::set_data( const tchar_t* data )
{
if(data)
{
m_text += data;
}
}

View File

@ -0,0 +1,18 @@
#include "html.h"
#include "el_div.h"
litehtml::el_div::el_div(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
void litehtml::el_div::parse_attributes()
{
const tchar_t* str = get_attr(_t("align"));
if(str)
{
m_style.add_property(_t("text-align"), str, nullptr, false, this);
}
html_tag::parse_attributes();
}

View File

@ -0,0 +1,55 @@
#include "html.h"
#include "el_font.h"
litehtml::el_font::el_font(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
void litehtml::el_font::parse_attributes()
{
const tchar_t* str = get_attr(_t("color"));
if(str)
{
m_style.add_property(_t("color"), str, nullptr, false, this);
}
str = get_attr(_t("face"));
if(str)
{
m_style.add_property(_t("font-family"), str, nullptr, false, this);
}
str = get_attr(_t("size"));
if(str)
{
int sz = t_atoi(str);
if(sz <= 1)
{
m_style.add_property(_t("font-size"), _t("x-small"), nullptr, false, this);
} else if(sz >= 6)
{
m_style.add_property(_t("font-size"), _t("xx-large"), nullptr, false, this);
} else
{
switch(sz)
{
case 2:
m_style.add_property(_t("font-size"), _t("small"), nullptr, false, this);
break;
case 3:
m_style.add_property(_t("font-size"), _t("medium"), nullptr, false, this);
break;
case 4:
m_style.add_property(_t("font-size"), _t("large"), nullptr, false, this);
break;
case 5:
m_style.add_property(_t("font-size"), _t("x-large"), nullptr, false, this);
break;
}
}
}
html_tag::parse_attributes();
}

View File

@ -0,0 +1,273 @@
#include "html.h"
#include "el_image.h"
#include "document.h"
litehtml::el_image::el_image(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
m_display = display_inline_block;
}
litehtml::el_image::~el_image( void )
{
}
void litehtml::el_image::get_content_size( size& sz, int max_width )
{
get_document()->container()->get_image_size(m_src.c_str(), 0, sz);
}
int litehtml::el_image::calc_max_height(int image_height)
{
document::ptr doc = get_document();
int percentSize = 0;
if (m_css_max_height.units() == css_units_percentage)
{
auto el_parent = parent();
if (el_parent)
{
if (!el_parent->get_predefined_height(percentSize))
{
return image_height;
}
}
}
return doc->cvt_units(m_css_max_height, m_font_size, percentSize);
}
int litehtml::el_image::line_height() const
{
return height();
}
bool litehtml::el_image::is_replaced() const
{
return true;
}
int litehtml::el_image::render( int x, int y, int max_width, bool second_pass )
{
int parent_width = max_width;
calc_outlines(parent_width);
m_pos.move_to(x, y);
document::ptr doc = get_document();
litehtml::size sz;
doc->container()->get_image_size(m_src.c_str(), 0, sz);
m_pos.width = sz.width;
m_pos.height = sz.height;
if(m_css_height.is_predefined() && m_css_width.is_predefined())
{
m_pos.height = sz.height;
m_pos.width = sz.width;
// check for max-width
if(!m_css_max_width.is_predefined())
{
int max_width = doc->cvt_units(m_css_max_width, m_font_size, parent_width);
if(m_pos.width > max_width)
{
m_pos.width = max_width;
}
if(sz.width)
{
m_pos.height = (int) ((float) m_pos.width * (float) sz.height / (float)sz.width);
} else
{
m_pos.height = sz.height;
}
}
// check for max-height
if(!m_css_max_height.is_predefined())
{
int max_height = calc_max_height(sz.height);
if(m_pos.height > max_height)
{
m_pos.height = max_height;
}
if(sz.height)
{
m_pos.width = (int) (m_pos.height * (float)sz.width / (float)sz.height);
} else
{
m_pos.width = sz.width;
}
}
} else if(!m_css_height.is_predefined() && m_css_width.is_predefined())
{
if (!get_predefined_height(m_pos.height))
{
m_pos.height = (int)m_css_height.val();
}
// check for max-height
if(!m_css_max_height.is_predefined())
{
int max_height = calc_max_height(sz.height);
if(m_pos.height > max_height)
{
m_pos.height = max_height;
}
}
if(sz.height)
{
m_pos.width = (int) (m_pos.height * (float)sz.width / (float)sz.height);
} else
{
m_pos.width = sz.width;
}
} else if(m_css_height.is_predefined() && !m_css_width.is_predefined())
{
m_pos.width = (int) m_css_width.calc_percent(parent_width);
// check for max-width
if(!m_css_max_width.is_predefined())
{
int max_width = doc->cvt_units(m_css_max_width, m_font_size, parent_width);
if(m_pos.width > max_width)
{
m_pos.width = max_width;
}
}
if(sz.width)
{
m_pos.height = (int) ((float) m_pos.width * (float) sz.height / (float)sz.width);
} else
{
m_pos.height = sz.height;
}
} else
{
m_pos.width = (int) m_css_width.calc_percent(parent_width);
m_pos.height = 0;
if (!get_predefined_height(m_pos.height))
{
m_pos.height = (int)m_css_height.val();
}
// check for max-height
if(!m_css_max_height.is_predefined())
{
int max_height = calc_max_height(sz.height);
if(m_pos.height > max_height)
{
m_pos.height = max_height;
}
}
// check for max-width
if(!m_css_max_width.is_predefined())
{
int max_width = doc->cvt_units(m_css_max_width, m_font_size, parent_width);
if(m_pos.width > max_width)
{
m_pos.width = max_width;
}
}
}
calc_auto_margins(parent_width);
m_pos.x += content_margins_left();
m_pos.y += content_margins_top();
return m_pos.width + content_margins_left() + content_margins_right();
}
void litehtml::el_image::parse_attributes()
{
m_src = get_attr(_t("src"), _t(""));
const tchar_t* attr_height = get_attr(_t("height"));
if(attr_height)
{
m_style.add_property(_t("height"), attr_height, nullptr, false, this);
}
const tchar_t* attr_width = get_attr(_t("width"));
if(attr_width)
{
m_style.add_property(_t("width"), attr_width, nullptr, false, this);
}
}
void litehtml::el_image::draw( uint_ptr hdc, int x, int y, const position* clip )
{
position pos = m_pos;
pos.x += x;
pos.y += y;
position el_pos = pos;
el_pos += m_padding;
el_pos += m_borders;
// draw standard background here
if (el_pos.does_intersect(clip))
{
const background* bg = get_background();
if (bg)
{
background_paint bg_paint;
init_background_paint(pos, bg_paint, bg);
get_document()->container()->draw_background(hdc, bg_paint);
}
}
// draw image as background
if(pos.does_intersect(clip))
{
if (pos.width > 0 && pos.height > 0) {
background_paint bg;
bg.image = m_src;
bg.clip_box = pos;
bg.origin_box = pos;
bg.border_box = pos;
bg.border_box += m_padding;
bg.border_box += m_borders;
bg.repeat = background_repeat_no_repeat;
bg.image_size.width = pos.width;
bg.image_size.height = pos.height;
bg.border_radius = m_css_borders.radius.calc_percents(bg.border_box.width, bg.border_box.height);
bg.position_x = pos.x;
bg.position_y = pos.y;
get_document()->container()->draw_background(hdc, bg);
}
}
// draw borders
if (el_pos.does_intersect(clip))
{
position border_box = pos;
border_box += m_padding;
border_box += m_borders;
borders bdr = m_css_borders;
bdr.radius = m_css_borders.radius.calc_percents(border_box.width, border_box.height);
get_document()->container()->draw_borders(hdc, bdr, border_box, !have_parent());
}
}
void litehtml::el_image::parse_styles( bool is_reparse /*= false*/ )
{
html_tag::parse_styles(is_reparse);
if(!m_src.empty())
{
if(!m_css_height.is_predefined() && !m_css_width.is_predefined())
{
get_document()->container()->load_image(m_src.c_str(), nullptr, true);
} else
{
get_document()->container()->load_image(m_src.c_str(), nullptr, false);
}
}
}

View File

@ -0,0 +1,35 @@
#include "html.h"
#include "el_li.h"
#include "document.h"
litehtml::el_li::el_li(const std::shared_ptr<litehtml::document>& doc) : litehtml::html_tag(doc)
{
}
int litehtml::el_li::render(int x, int y, int max_width, bool second_pass)
{
if (m_list_style_type >= list_style_type_armenian && !m_index_initialized)
{
if (auto p = parent())
{
const auto hasStart = p->get_attr(_t("start"));
const int start = hasStart ? t_atoi(hasStart) : 1;
tchar_t val[2] = { (tchar_t)start, 0 };
for (int i = 0, n = (int)p->get_children_count(); i < n; ++i)
{
auto child = p->get_child(i);
if (child.get() == this)
{
set_attr(_t("list_index"), val);
break;
}
else if (!t_strcmp(child->get_tagName(), _t("li")))
++val[0];
}
}
m_index_initialized = true;
}
return html_tag::render(x, y, max_width, second_pass);
}
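
// Illustrative sketch only, not part of litehtml: the ordinal computed by
// el_li::render() above is "start" plus the number of <li> siblings that
// precede this element. The code above keeps the counter in val[0], a single
// tchar_t, and stores it as a one-character string attribute; the sketch
// below uses a plain int instead, which is an assumption of the sketch.
// list_index_for and the tag-name vector are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parent's child list: tag names in document order.
// Returns the ordinal of the child at position `self`, counting only the "li"
// siblings before it and starting from `start` (the list's start attribute).
int list_index_for(const std::vector<std::string>& sibling_tags,
                   std::size_t self,
                   int start = 1)
{
    int index = start;
    for (std::size_t i = 0; i < self && i < sibling_tags.size(); ++i)
    {
        if (sibling_tags[i] == "li")
        {
            ++index; // every earlier <li> pushes this item one position further
        }
    }
    return index;
}
```

// Non-<li> children (text nodes, comments) are skipped, so they do not
// consume a number.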

@ -0,0 +1,39 @@
#include "html.h"
#include "el_link.h"
#include "document.h"
litehtml::el_link::el_link(const std::shared_ptr<litehtml::document>& doc) : litehtml::html_tag(doc)
{
}
void litehtml::el_link::parse_attributes()
{
bool processed = false;
document::ptr doc = get_document();
const tchar_t* rel = get_attr(_t("rel"));
if(rel && !t_strcmp(rel, _t("stylesheet")))
{
const tchar_t* media = get_attr(_t("media"));
const tchar_t* href = get_attr(_t("href"));
if(href && href[0])
{
tstring css_text;
tstring css_baseurl;
doc->container()->import_css(css_text, href, css_baseurl);
if(!css_text.empty())
{
doc->add_stylesheet(css_text.c_str(), css_baseurl.c_str(), media);
processed = true;
}
}
}
if(!processed)
{
doc->container()->link(doc, shared_from_this());
}
}

@ -0,0 +1,18 @@
#include "html.h"
#include "el_para.h"
#include "document.h"
litehtml::el_para::el_para(const std::shared_ptr<litehtml::document>& doc) : litehtml::html_tag(doc)
{
}
void litehtml::el_para::parse_attributes()
{
const tchar_t* str = get_attr(_t("align"));
if(str)
{
m_style.add_property(_t("text-align"), str, nullptr, false, this);
}
html_tag::parse_attributes();
}

@ -0,0 +1,25 @@
#include "html.h"
#include "el_script.h"
#include "document.h"
litehtml::el_script::el_script(const std::shared_ptr<litehtml::document>& doc) : litehtml::element(doc)
{
}
void litehtml::el_script::parse_attributes()
{
//TODO: pass script text to document container
}
bool litehtml::el_script::appendChild(const ptr &el)
{
el->get_text(m_text);
return true;
}
const litehtml::tchar_t* litehtml::el_script::get_tagName() const
{
return _t("script");
}

@ -0,0 +1,40 @@
#include "html.h"
#include "document.h"
#include "el_space.h"
litehtml::el_space::el_space(const tchar_t* text, const std::shared_ptr<litehtml::document>& doc) : el_text(text, doc)
{
}
bool litehtml::el_space::is_white_space() const
{
white_space ws = get_white_space();
if( ws == white_space_normal ||
ws == white_space_nowrap ||
ws == white_space_pre_line )
{
return true;
}
return false;
}
bool litehtml::el_space::is_break() const
{
white_space ws = get_white_space();
if( ws == white_space_pre ||
ws == white_space_pre_line ||
ws == white_space_pre_wrap)
{
if(m_text == _t("\n"))
{
return true;
}
}
return false;
}
bool litehtml::el_space::is_space() const
{
return true;
}
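
// Illustrative sketch only, not part of litehtml: the two predicates above
// amount to a small decision table over the CSS white-space mode. The enum
// and function names below are hypothetical; only the mode groupings are
// taken from el_space::is_white_space() and el_space::is_break().

```cpp
#include <cassert>
#include <string>

enum class WhiteSpace { Normal, Nowrap, Pre, PreLine, PreWrap };

// Modes in which a whitespace run is treated as ordinary collapsible space.
bool collapses_to_space(WhiteSpace ws)
{
    return ws == WhiteSpace::Normal ||
           ws == WhiteSpace::Nowrap ||
           ws == WhiteSpace::PreLine;
}

// A whitespace token forces a line break only when the mode preserves
// newlines and the token is exactly one "\n".
bool is_line_break(WhiteSpace ws, const std::string& text)
{
    const bool preserves_newlines = ws == WhiteSpace::Pre ||
                                    ws == WhiteSpace::PreLine ||
                                    ws == WhiteSpace::PreWrap;
    return preserves_newlines && text == "\n";
}
```

// Note that pre-line appears in both groups: spaces collapse, but a newline
// still breaks the line.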

@ -0,0 +1,31 @@
#include "html.h"
#include "el_style.h"
#include "document.h"
litehtml::el_style::el_style(const std::shared_ptr<litehtml::document>& doc) : litehtml::element(doc)
{
}
void litehtml::el_style::parse_attributes()
{
tstring text;
for(auto& el : m_children)
{
el->get_text(text);
}
get_document()->add_stylesheet( text.c_str(), nullptr, get_attr(_t("media")) );
}
bool litehtml::el_style::appendChild(const ptr &el)
{
m_children.push_back(el);
return true;
}
const litehtml::tchar_t* litehtml::el_style::get_tagName() const
{
return _t("style");
}

@ -0,0 +1,105 @@
#include "html.h"
#include "el_table.h"
#include "document.h"
#include "iterators.h"
litehtml::el_table::el_table(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
m_border_spacing_x = 0;
m_border_spacing_y = 0;
m_border_collapse = border_collapse_separate;
}
bool litehtml::el_table::appendChild(const litehtml::element::ptr& el)
{
if(!el) return false;
if( !t_strcmp(el->get_tagName(), _t("tbody")) ||
!t_strcmp(el->get_tagName(), _t("thead")) ||
!t_strcmp(el->get_tagName(), _t("tfoot")) ||
!t_strcmp(el->get_tagName(), _t("caption")))
{
return html_tag::appendChild(el);
}
return false;
}
void litehtml::el_table::parse_styles(bool is_reparse)
{
html_tag::parse_styles(is_reparse);
m_border_collapse = (border_collapse) value_index(get_style_property(_t("border-collapse"), true, _t("separate")), border_collapse_strings, border_collapse_separate);
if(m_border_collapse == border_collapse_separate)
{
m_css_border_spacing_x.fromString(get_style_property(_t("-litehtml-border-spacing-x"), true, _t("0px")));
m_css_border_spacing_y.fromString(get_style_property(_t("-litehtml-border-spacing-y"), true, _t("0px")));
int fntsz = get_font_size();
document::ptr doc = get_document();
m_border_spacing_x = doc->cvt_units(m_css_border_spacing_x, fntsz);
m_border_spacing_y = doc->cvt_units(m_css_border_spacing_y, fntsz);
} else
{
m_border_spacing_x = 0;
m_border_spacing_y = 0;
m_padding.bottom = 0;
m_padding.top = 0;
m_padding.left = 0;
m_padding.right = 0;
m_css_padding.bottom.set_value(0, css_units_px);
m_css_padding.top.set_value(0, css_units_px);
m_css_padding.left.set_value(0, css_units_px);
m_css_padding.right.set_value(0, css_units_px);
}
}
void litehtml::el_table::parse_attributes()
{
const tchar_t* str = get_attr(_t("width"));
if(str)
{
m_style.add_property(_t("width"), str, nullptr, false, this);
}
str = get_attr(_t("align"));
if(str)
{
int align = value_index(str, _t("left;center;right"));
switch(align)
{
case 1:
m_style.add_property(_t("margin-left"), _t("auto"), nullptr, false, this);
m_style.add_property(_t("margin-right"), _t("auto"), nullptr, false, this);
break;
case 2:
m_style.add_property(_t("margin-left"), _t("auto"), nullptr, false, this);
m_style.add_property(_t("margin-right"), _t("0"), nullptr, false, this);
break;
}
}
str = get_attr(_t("cellspacing"));
if(str)
{
tstring val = str;
val += _t(" ");
val += str;
m_style.add_property(_t("border-spacing"), val.c_str(), nullptr, false, this);
}
str = get_attr(_t("border"));
if(str)
{
m_style.add_property(_t("border-width"), str, nullptr, false, this);
}
str = get_attr(_t("bgcolor"));
if (str)
{
m_style.add_property(_t("background-color"), str, nullptr, false, this);
}
html_tag::parse_attributes();
}
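
// Illustrative sketch only, not part of litehtml: the align handling in
// el_table::parse_attributes() above maps the legacy attribute onto
// horizontal margins. margins_for_align is a hypothetical helper; an empty
// string means "leave the property untouched", which is how the sketch
// models the cases the switch above does not handle.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns {margin-left, margin-right} for a legacy table align value.
std::pair<std::string, std::string> margins_for_align(const std::string& align)
{
    if (align == "center")
    {
        return {"auto", "auto"}; // equal auto margins centre the table
    }
    if (align == "right")
    {
        return {"auto", "0"};    // auto on the left pushes the table right
    }
    return {"", ""};             // "left" and unknown values add nothing
}
```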

@ -0,0 +1,44 @@
#include "html.h"
#include "el_td.h"
litehtml::el_td::el_td(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
void litehtml::el_td::parse_attributes()
{
const tchar_t* str = get_attr(_t("width"));
if(str)
{
m_style.add_property(_t("width"), str, nullptr, false, this);
}
str = get_attr(_t("background"));
if(str)
{
tstring url = _t("url('");
url += str;
url += _t("')");
m_style.add_property(_t("background-image"), url.c_str(), nullptr, false, this);
}
str = get_attr(_t("align"));
if(str)
{
m_style.add_property(_t("text-align"), str, nullptr, false, this);
}
str = get_attr(_t("bgcolor"));
if (str)
{
m_style.add_property(_t("background-color"), str, nullptr, false, this);
}
str = get_attr(_t("valign"));
if(str)
{
m_style.add_property(_t("vertical-align"), str, nullptr, false, this);
}
html_tag::parse_attributes();
}

@ -0,0 +1,183 @@
#include "html.h"
#include "el_text.h"
#include "document.h"
litehtml::el_text::el_text(const tchar_t* text, const std::shared_ptr<litehtml::document>& doc) : element(doc)
{
if(text)
{
m_text = text;
}
m_text_transform = text_transform_none;
m_use_transformed = false;
m_draw_spaces = true;
}
void litehtml::el_text::get_content_size( size& sz, int max_width )
{
sz = m_size;
}
void litehtml::el_text::get_text( tstring& text )
{
text += m_text;
}
const litehtml::tchar_t* litehtml::el_text::get_style_property( const tchar_t* name, bool inherited, const tchar_t* def /*= 0*/ ) const
{
if(inherited)
{
element::ptr el_parent = parent();
if (el_parent)
{
return el_parent->get_style_property(name, inherited, def);
}
}
return def;
}
void litehtml::el_text::parse_styles(bool is_reparse)
{
m_text_transform = (text_transform) value_index(get_style_property(_t("text-transform"), true, _t("none")), text_transform_strings, text_transform_none);
if(m_text_transform != text_transform_none)
{
m_transformed_text = m_text;
m_use_transformed = true;
get_document()->container()->transform_text(m_transformed_text, m_text_transform);
}
if(is_white_space())
{
m_transformed_text = _t(" ");
m_use_transformed = true;
} else
{
if(m_text == _t("\t"))
{
m_transformed_text = _t(" ");
m_use_transformed = true;
}
if(m_text == _t("\n") || m_text == _t("\r"))
{
m_transformed_text = _t("");
m_use_transformed = true;
}
}
font_metrics fm;
uint_ptr font = 0;
element::ptr el_parent = parent();
if (el_parent)
{
font = el_parent->get_font(&fm);
}
if(is_break())
{
m_size.height = 0;
m_size.width = 0;
} else
{
m_size.height = fm.height;
m_size.width = get_document()->container()->text_width(m_use_transformed ? m_transformed_text.c_str() : m_text.c_str(), font);
}
m_draw_spaces = fm.draw_spaces;
}
int litehtml::el_text::get_base_line()
{
element::ptr el_parent = parent();
if (el_parent)
{
return el_parent->get_base_line();
}
return 0;
}
void litehtml::el_text::draw( uint_ptr hdc, int x, int y, const position* clip )
{
if(is_white_space() && !m_draw_spaces)
{
return;
}
position pos = m_pos;
pos.x += x;
pos.y += y;
if(pos.does_intersect(clip))
{
element::ptr el_parent = parent();
if (el_parent)
{
document::ptr doc = get_document();
uint_ptr font = el_parent->get_font();
litehtml::web_color color = el_parent->get_color(_t("color"), true, doc->get_def_color());
doc->container()->draw_text(hdc, m_use_transformed ? m_transformed_text.c_str() : m_text.c_str(), font, color, pos);
}
}
}
int litehtml::el_text::line_height() const
{
element::ptr el_parent = parent();
if (el_parent)
{
return el_parent->line_height();
}
return 0;
}
litehtml::uint_ptr litehtml::el_text::get_font( font_metrics* fm /*= 0*/ )
{
element::ptr el_parent = parent();
if (el_parent)
{
return el_parent->get_font(fm);
}
return 0;
}
litehtml::style_display litehtml::el_text::get_display() const
{
return display_inline_text;
}
litehtml::white_space litehtml::el_text::get_white_space() const
{
element::ptr el_parent = parent();
if (el_parent) return el_parent->get_white_space();
return white_space_normal;
}
litehtml::element_position litehtml::el_text::get_element_position(css_offsets* offsets) const
{
element::ptr p = parent();
while(p && p->get_display() == display_inline)
{
if(p->get_element_position() == element_position_relative)
{
if(offsets)
{
*offsets = p->get_css_offsets();
}
return element_position_relative;
}
p = p->parent();
}
return element_position_static;
}
litehtml::css_offsets litehtml::el_text::get_css_offsets() const
{
element::ptr p = parent();
while(p && p->get_display() == display_inline)
{
if(p->get_element_position() == element_position_relative)
{
return p->get_css_offsets();
}
p = p->parent();
}
return {};
}

@ -0,0 +1,15 @@
#include "html.h"
#include "el_title.h"
#include "document.h"
litehtml::el_title::el_title(const std::shared_ptr<litehtml::document>& doc) : litehtml::html_tag(doc)
{
}
void litehtml::el_title::parse_attributes()
{
tstring text;
get_text(text);
get_document()->container()->set_caption(text.c_str());
}

@ -0,0 +1,46 @@
#include "html.h"
#include "el_tr.h"
litehtml::el_tr::el_tr(const std::shared_ptr<litehtml::document>& doc) : html_tag(doc)
{
}
void litehtml::el_tr::parse_attributes()
{
const tchar_t* str = get_attr(_t("align"));
if(str)
{
m_style.add_property(_t("text-align"), str, nullptr, false, this);
}
str = get_attr(_t("valign"));
if(str)
{
m_style.add_property(_t("vertical-align"), str, nullptr, false, this);
}
str = get_attr(_t("bgcolor"));
if (str)
{
m_style.add_property(_t("background-color"), str, nullptr, false, this);
}
html_tag::parse_attributes();
}
void litehtml::el_tr::get_inline_boxes( position::vector& boxes )
{
position pos;
for(auto& el : m_children)
{
if(el->get_display() == display_table_cell)
{
pos.x = el->left() + el->margin_left();
pos.y = el->top() - m_padding.top - m_borders.top;
pos.width = el->right() - pos.x - el->margin_right() - el->margin_left();
pos.height = el->height() + m_padding.top + m_padding.bottom + m_borders.top + m_borders.bottom;
boxes.push_back(pos);
}
}
}

@ -0,0 +1,411 @@
#include "html.h"
#include "element.h"
#include "document.h"
#define LITEHTML_EMPTY_FUNC {}
#define LITEHTML_RETURN_FUNC(ret) {return ret;}
litehtml::element::element(const std::shared_ptr<litehtml::document>& doc) : m_doc(doc)
{
m_box = nullptr;
m_skip = false;
}
bool litehtml::element::is_point_inside( int x, int y )
{
if(get_display() != display_inline && get_display() != display_table_row)
{
position pos = m_pos;
pos += m_padding;
pos += m_borders;
if(pos.is_point_inside(x, y))
{
return true;
} else
{
return false;
}
} else
{
position::vector boxes;
get_inline_boxes(boxes);
for(auto & box : boxes)
{
if(box.is_point_inside(x, y))
{
return true;
}
}
}
return false;
}
litehtml::web_color litehtml::element::get_color( const tchar_t* prop_name, bool inherited, const litehtml::web_color& def_color )
{
const tchar_t* clrstr = get_style_property(prop_name, inherited, nullptr);
if(!clrstr)
{
return def_color;
}
return web_color::from_string(clrstr, get_document()->container());
}
litehtml::position litehtml::element::get_placement() const
{
litehtml::position pos = m_pos;
element::ptr cur_el = parent();
while(cur_el)
{
pos.x += cur_el->m_pos.x;
pos.y += cur_el->m_pos.y;
cur_el = cur_el->parent();
}
return pos;
}
bool litehtml::element::is_inline_box() const
{
style_display d = get_display();
if( d == display_inline ||
d == display_inline_table ||
d == display_inline_block ||
d == display_inline_text)
{
return true;
}
return false;
}
bool litehtml::element::collapse_top_margin() const
{
if(!m_borders.top && !m_padding.top && in_normal_flow() && get_float() == float_none && m_margins.top >= 0 && have_parent())
{
return true;
}
return false;
}
bool litehtml::element::collapse_bottom_margin() const
{
if(!m_borders.bottom && !m_padding.bottom && in_normal_flow() && get_float() == float_none && m_margins.bottom >= 0 && have_parent())
{
return true;
}
return false;
}
bool litehtml::element::get_predefined_height(int& p_height) const
{
css_length h = get_css_height();
if(h.is_predefined())
{
p_height = m_pos.height;
return false;
}
if(h.units() == css_units_percentage)
{
element::ptr el_parent = parent();
if (!el_parent)
{
position client_pos;
get_document()->container()->get_client_rect(client_pos);
p_height = h.calc_percent(client_pos.height);
return true;
} else
{
int ph = 0;
if (el_parent->get_predefined_height(ph))
{
p_height = h.calc_percent(ph);
if (is_body())
{
p_height -= content_margins_height();
}
return true;
} else
{
p_height = m_pos.height;
return false;
}
}
}
p_height = get_document()->cvt_units(h, get_font_size());
return true;
}
void litehtml::element::calc_document_size( litehtml::size& sz, int x /*= 0*/, int y /*= 0*/ )
{
if(is_visible())
{
sz.width = std::max(sz.width, x + right());
sz.height = std::max(sz.height, y + bottom());
}
}
void litehtml::element::get_redraw_box(litehtml::position& pos, int x /*= 0*/, int y /*= 0*/)
{
if(is_visible())
{
int p_left = std::min(pos.left(), x + m_pos.left() - m_padding.left - m_borders.left);
int p_right = std::max(pos.right(), x + m_pos.right() + m_padding.right + m_borders.right);
int p_top = std::min(pos.top(), y + m_pos.top() - m_padding.top - m_borders.top);
int p_bottom = std::max(pos.bottom(), y + m_pos.bottom() + m_padding.bottom + m_borders.bottom);
pos.x = p_left;
pos.y = p_top;
pos.width = p_right - p_left;
pos.height = p_bottom - p_top;
}
}
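
// Illustrative sketch only, not part of litehtml: get_redraw_box() above
// grows an accumulated rectangle so that it also covers this element's
// border box. Box, Edges and grow_redraw_box are hypothetical stand-ins for
// litehtml::position, the padding/border structs and the member function.
// Each side is extended by the padding and border of that same side.

```cpp
#include <algorithm>
#include <cassert>

struct Box
{
    int x, y, width, height;
    int left() const   { return x; }
    int top() const    { return y; }
    int right() const  { return x + width; }
    int bottom() const { return y + height; }
};

struct Edges { int left, top, right, bottom; };

// Union of `acc` with the border box of `content`, offset by (dx, dy).
Box grow_redraw_box(const Box& acc, const Box& content,
                    const Edges& padding, const Edges& borders,
                    int dx = 0, int dy = 0)
{
    const int l = std::min(acc.left(),   dx + content.left()   - padding.left   - borders.left);
    const int r = std::max(acc.right(),  dx + content.right()  + padding.right  + borders.right);
    const int t = std::min(acc.top(),    dy + content.top()    - padding.top    - borders.top);
    const int b = std::max(acc.bottom(), dy + content.bottom() + padding.bottom + borders.bottom);
    return Box{l, t, r - l, b - t};
}
```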
int litehtml::element::calc_width(int defVal) const
{
css_length w = get_css_width();
if(w.is_predefined() || get_display() == display_table_cell)
{
return defVal;
}
if(w.units() == css_units_percentage)
{
element::ptr el_parent = parent();
if (!el_parent)
{
position client_pos;
get_document()->container()->get_client_rect(client_pos);
return w.calc_percent(client_pos.width) - content_margins_width();
} else
{
int pw = el_parent->calc_width(defVal);
if (is_body())
{
pw -= content_margins_width();
}
return w.calc_percent(pw);
}
}
return get_document()->cvt_units(w, get_font_size());
}
bool litehtml::element::is_ancestor(const ptr &el) const
{
element::ptr el_parent = parent();
while(el_parent && el_parent != el)
{
el_parent = el_parent->parent();
}
if(el_parent)
{
return true;
}
return false;
}
int litehtml::element::get_inline_shift_left()
{
int ret = 0;
element::ptr el_parent = parent();
if (el_parent)
{
if (el_parent->get_display() == display_inline)
{
style_display disp = get_display();
if (disp == display_inline_text || disp == display_inline_block)
{
element::ptr el = shared_from_this();
while (el_parent && el_parent->get_display() == display_inline)
{
if (el_parent->is_first_child_inline(el))
{
ret += el_parent->padding_left() + el_parent->border_left() + el_parent->margin_left();
}
el = el_parent;
el_parent = el_parent->parent();
}
}
}
}
return ret;
}
int litehtml::element::get_inline_shift_right()
{
int ret = 0;
element::ptr el_parent = parent();
if (el_parent)
{
if (el_parent->get_display() == display_inline)
{
style_display disp = get_display();
if (disp == display_inline_text || disp == display_inline_block)
{
element::ptr el = shared_from_this();
while (el_parent && el_parent->get_display() == display_inline)
{
if (el_parent->is_last_child_inline(el))
{
ret += el_parent->padding_right() + el_parent->border_right() + el_parent->margin_right();
}
el = el_parent;
el_parent = el_parent->parent();
}
}
}
}
return ret;
}
void litehtml::element::apply_relative_shift(int parent_width)
{
css_offsets offsets;
if (get_element_position(&offsets) == element_position_relative)
{
if (!offsets.left.is_predefined())
{
m_pos.x += offsets.left.calc_percent(parent_width);
}
else if (!offsets.right.is_predefined())
{
m_pos.x -= offsets.right.calc_percent(parent_width);
}
if (!offsets.top.is_predefined())
{
int h = 0;
if (offsets.top.units() == css_units_percentage)
{
element::ptr el_parent = parent();
if (el_parent)
{
el_parent->get_predefined_height(h);
}
}
m_pos.y += offsets.top.calc_percent(h);
}
else if (!offsets.bottom.is_predefined())
{
int h = 0;
if (offsets.bottom.units() == css_units_percentage)
{
element::ptr el_parent = parent();
if (el_parent)
{
el_parent->get_predefined_height(h);
}
}
m_pos.y -= offsets.bottom.calc_percent(h);
}
}
}
bool litehtml::element::is_table_skip() const
{
return is_space() || is_comment() || get_display() == display_none;
}
void litehtml::element::calc_auto_margins(int parent_width) LITEHTML_EMPTY_FUNC
const litehtml::background* litehtml::element::get_background(bool own_only) LITEHTML_RETURN_FUNC(nullptr)
litehtml::element::ptr litehtml::element::get_element_by_point(int x, int y, int client_x, int client_y) LITEHTML_RETURN_FUNC(nullptr)
litehtml::element::ptr litehtml::element::get_child_by_point(int x, int y, int client_x, int client_y, draw_flag flag, int zindex) LITEHTML_RETURN_FUNC(nullptr)
void litehtml::element::get_line_left_right( int y, int def_right, int& ln_left, int& ln_right ) LITEHTML_EMPTY_FUNC
void litehtml::element::add_style( const tstring& style, const tstring& baseurl ) LITEHTML_EMPTY_FUNC
void litehtml::element::select_all(const css_selector& selector, litehtml::elements_vector& res) LITEHTML_EMPTY_FUNC
litehtml::elements_vector litehtml::element::select_all(const litehtml::css_selector& selector) LITEHTML_RETURN_FUNC(litehtml::elements_vector())
litehtml::elements_vector litehtml::element::select_all(const litehtml::tstring& selector) LITEHTML_RETURN_FUNC(litehtml::elements_vector())
litehtml::element::ptr litehtml::element::select_one( const css_selector& selector ) LITEHTML_RETURN_FUNC(nullptr)
litehtml::element::ptr litehtml::element::select_one( const tstring& selector ) LITEHTML_RETURN_FUNC(nullptr)
litehtml::element::ptr litehtml::element::find_adjacent_sibling(const element::ptr& el, const css_selector& selector, bool apply_pseudo /*= true*/, bool* is_pseudo /*= 0*/) LITEHTML_RETURN_FUNC(nullptr)
litehtml::element::ptr litehtml::element::find_sibling(const element::ptr& el, const css_selector& selector, bool apply_pseudo /*= true*/, bool* is_pseudo /*= 0*/) LITEHTML_RETURN_FUNC(nullptr)
bool litehtml::element::is_nth_last_child(const element::ptr& el, int num, int off, bool of_type) const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_nth_child(const element::ptr&, int num, int off, bool of_type) const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_only_child(const element::ptr& el, bool of_type) const LITEHTML_RETURN_FUNC(false)
litehtml::overflow litehtml::element::get_overflow() const LITEHTML_RETURN_FUNC(overflow_visible)
void litehtml::element::draw_children( uint_ptr hdc, int x, int y, const position* clip, draw_flag flag, int zindex ) LITEHTML_EMPTY_FUNC
void litehtml::element::draw_stacking_context( uint_ptr hdc, int x, int y, const position* clip, bool with_positioned ) LITEHTML_EMPTY_FUNC
void litehtml::element::render_positioned(render_type rt) LITEHTML_EMPTY_FUNC
int litehtml::element::get_zindex() const LITEHTML_RETURN_FUNC(0)
bool litehtml::element::fetch_positioned() LITEHTML_RETURN_FUNC(false)
litehtml::visibility litehtml::element::get_visibility() const LITEHTML_RETURN_FUNC(visibility_visible)
void litehtml::element::apply_vertical_align() LITEHTML_EMPTY_FUNC
void litehtml::element::set_css_width( css_length& w ) LITEHTML_EMPTY_FUNC
litehtml::element::ptr litehtml::element::get_child( int idx ) const LITEHTML_RETURN_FUNC(nullptr)
size_t litehtml::element::get_children_count() const LITEHTML_RETURN_FUNC(0)
void litehtml::element::calc_outlines( int parent_width ) LITEHTML_EMPTY_FUNC
litehtml::css_length litehtml::element::get_css_width() const LITEHTML_RETURN_FUNC(css_length())
litehtml::css_length litehtml::element::get_css_height() const LITEHTML_RETURN_FUNC(css_length())
litehtml::element_clear litehtml::element::get_clear() const LITEHTML_RETURN_FUNC(clear_none)
litehtml::css_length litehtml::element::get_css_left() const LITEHTML_RETURN_FUNC(css_length())
litehtml::css_length litehtml::element::get_css_right() const LITEHTML_RETURN_FUNC(css_length())
litehtml::css_length litehtml::element::get_css_top() const LITEHTML_RETURN_FUNC(css_length())
litehtml::css_length litehtml::element::get_css_bottom() const LITEHTML_RETURN_FUNC(css_length())
litehtml::css_offsets litehtml::element::get_css_offsets() const LITEHTML_RETURN_FUNC(css_offsets())
litehtml::vertical_align litehtml::element::get_vertical_align() const LITEHTML_RETURN_FUNC(va_baseline)
int litehtml::element::place_element(const ptr &el, int max_width) LITEHTML_RETURN_FUNC(0)
int litehtml::element::render_inline(const ptr &container, int max_width) LITEHTML_RETURN_FUNC(0)
void litehtml::element::add_positioned(const ptr &el) LITEHTML_EMPTY_FUNC
int litehtml::element::find_next_line_top( int top, int width, int def_right ) LITEHTML_RETURN_FUNC(0)
litehtml::element_float litehtml::element::get_float() const LITEHTML_RETURN_FUNC(float_none)
void litehtml::element::add_float(const ptr &el, int x, int y) LITEHTML_EMPTY_FUNC
void litehtml::element::update_floats(int dy, const ptr &parent) LITEHTML_EMPTY_FUNC
int litehtml::element::get_line_left( int y ) LITEHTML_RETURN_FUNC(0)
int litehtml::element::get_line_right( int y, int def_right ) LITEHTML_RETURN_FUNC(def_right)
int litehtml::element::get_left_floats_height() const LITEHTML_RETURN_FUNC(0)
int litehtml::element::get_right_floats_height() const LITEHTML_RETURN_FUNC(0)
int litehtml::element::get_floats_height(element_float el_float) const LITEHTML_RETURN_FUNC(0)
bool litehtml::element::is_floats_holder() const LITEHTML_RETURN_FUNC(false)
void litehtml::element::get_content_size( size& sz, int max_width ) LITEHTML_EMPTY_FUNC
void litehtml::element::init() LITEHTML_EMPTY_FUNC
int litehtml::element::render( int x, int y, int max_width, bool second_pass ) LITEHTML_RETURN_FUNC(0)
bool litehtml::element::appendChild(const ptr &el) LITEHTML_RETURN_FUNC(false)
bool litehtml::element::removeChild(const ptr &el) LITEHTML_RETURN_FUNC(false)
void litehtml::element::clearRecursive() LITEHTML_EMPTY_FUNC
const litehtml::tchar_t* litehtml::element::get_tagName() const LITEHTML_RETURN_FUNC(_t(""))
void litehtml::element::set_tagName( const tchar_t* tag ) LITEHTML_EMPTY_FUNC
void litehtml::element::set_data( const tchar_t* data ) LITEHTML_EMPTY_FUNC
void litehtml::element::set_attr( const tchar_t* name, const tchar_t* val ) LITEHTML_EMPTY_FUNC
void litehtml::element::apply_stylesheet( const litehtml::css& stylesheet ) LITEHTML_EMPTY_FUNC
void litehtml::element::refresh_styles() LITEHTML_EMPTY_FUNC
void litehtml::element::on_click() LITEHTML_EMPTY_FUNC
void litehtml::element::init_font() LITEHTML_EMPTY_FUNC
void litehtml::element::get_inline_boxes( position::vector& boxes ) LITEHTML_EMPTY_FUNC
void litehtml::element::parse_styles( bool is_reparse /*= false*/ ) LITEHTML_EMPTY_FUNC
const litehtml::tchar_t* litehtml::element::get_attr( const tchar_t* name, const tchar_t* def /*= 0*/ ) const LITEHTML_RETURN_FUNC(def)
bool litehtml::element::is_white_space() const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_space() const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_comment() const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_body() const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_break() const LITEHTML_RETURN_FUNC(false)
int litehtml::element::get_base_line() LITEHTML_RETURN_FUNC(0)
bool litehtml::element::on_mouse_over() LITEHTML_RETURN_FUNC(false)
bool litehtml::element::on_mouse_leave() LITEHTML_RETURN_FUNC(false)
bool litehtml::element::on_lbutton_down() LITEHTML_RETURN_FUNC(false)
bool litehtml::element::on_lbutton_up() LITEHTML_RETURN_FUNC(false)
bool litehtml::element::find_styles_changes( position::vector& redraw_boxes, int x, int y ) LITEHTML_RETURN_FUNC(false)
const litehtml::tchar_t* litehtml::element::get_cursor() LITEHTML_RETURN_FUNC(nullptr)
litehtml::white_space litehtml::element::get_white_space() const LITEHTML_RETURN_FUNC(white_space_normal)
litehtml::style_display litehtml::element::get_display() const LITEHTML_RETURN_FUNC(display_none)
bool litehtml::element::set_pseudo_class( const tchar_t* pclass, bool add ) LITEHTML_RETURN_FUNC(false)
bool litehtml::element::set_class( const tchar_t* pclass, bool add ) LITEHTML_RETURN_FUNC(false)
litehtml::element_position litehtml::element::get_element_position(css_offsets* offsets) const LITEHTML_RETURN_FUNC(element_position_static)
bool litehtml::element::is_replaced() const LITEHTML_RETURN_FUNC(false)
int litehtml::element::line_height() const LITEHTML_RETURN_FUNC(0)
void litehtml::element::draw( uint_ptr hdc, int x, int y, const position* clip ) LITEHTML_EMPTY_FUNC
void litehtml::element::draw_background( uint_ptr hdc, int x, int y, const position* clip ) LITEHTML_EMPTY_FUNC
const litehtml::tchar_t* litehtml::element::get_style_property( const tchar_t* name, bool inherited, const tchar_t* def /*= 0*/ ) const LITEHTML_RETURN_FUNC(nullptr)
litehtml::uint_ptr litehtml::element::get_font( font_metrics* fm /*= 0*/ ) LITEHTML_RETURN_FUNC(0)
int litehtml::element::get_font_size() const LITEHTML_RETURN_FUNC(0)
void litehtml::element::get_text( tstring& text ) LITEHTML_EMPTY_FUNC
void litehtml::element::parse_attributes() LITEHTML_EMPTY_FUNC
int litehtml::element::select( const css_selector& selector, bool apply_pseudo) LITEHTML_RETURN_FUNC(select_no_match)
int litehtml::element::select( const css_element_selector& selector, bool apply_pseudo /*= true*/ ) LITEHTML_RETURN_FUNC(select_no_match)
litehtml::element::ptr litehtml::element::find_ancestor(const css_selector& selector, bool apply_pseudo, bool* is_pseudo) LITEHTML_RETURN_FUNC(nullptr)
bool litehtml::element::is_first_child_inline(const element::ptr& el) const LITEHTML_RETURN_FUNC(false)
bool litehtml::element::is_last_child_inline(const element::ptr& el) LITEHTML_RETURN_FUNC(false)
bool litehtml::element::have_inline_child() const LITEHTML_RETURN_FUNC(false)

@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@ -0,0 +1,44 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
#include "attribute.h"
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include "util.h"
struct GumboInternalParser;
GumboAttribute* gumbo_get_attribute(
const GumboVector* attributes, const char* name) {
for (unsigned int i = 0; i < attributes->length; ++i) {
GumboAttribute* attr = attributes->data[i];
if (!strcasecmp(attr->name, name)) {
return attr;
}
}
return NULL;
}
void gumbo_destroy_attribute(
struct GumboInternalParser* parser, GumboAttribute* attribute) {
gumbo_parser_deallocate(parser, (void*) attribute->name);
gumbo_parser_deallocate(parser, (void*) attribute->value);
gumbo_parser_deallocate(parser, (void*) attribute);
}
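
The lookup in gumbo_get_attribute above is a linear scan with a case-insensitive name comparison. A minimal standalone sketch of the same pattern follows; DemoAttribute, DemoVector, demo_get_attribute and demo_lookup_value are hypothetical stand-ins invented for illustration (they are not part of Gumbo), and strcasecmp is assumed to come from the POSIX <strings.h> header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical stand-ins for GumboAttribute and GumboVector; only the fields
 * that the lookup needs are reproduced here. */
typedef struct {
  const char* name;
  const char* value;
} DemoAttribute;

typedef struct {
  void** data;
  unsigned int length;
} DemoVector;

/* Linear scan with a case-insensitive name match; returns NULL on a miss. */
static DemoAttribute* demo_get_attribute(
    const DemoVector* attributes, const char* name) {
  assert(attributes != NULL && name != NULL);
  for (unsigned int i = 0; i < attributes->length; ++i) {
    DemoAttribute* attr = attributes->data[i];
    if (strcasecmp(attr->name, name) == 0) {
      return attr;
    }
  }
  return NULL;
}

/* Builds a two-attribute vector and returns the value stored under `name`,
 * or NULL if no attribute with that name exists. */
static const char* demo_lookup_value(const char* name) {
  static DemoAttribute href = {"href", "https://example.com/"};
  static DemoAttribute cls = {"class", "external"};
  static void* items[] = {&href, &cls};
  DemoVector attrs = {items, 2};
  DemoAttribute* found = demo_get_attribute(&attrs, name);
  return found ? found->value : NULL;
}
```

Callers always have to handle the NULL result, because a missing attribute is the normal case rather than an error.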

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -0,0 +1,279 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
#include "error.h"
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include "gumbo.h"
#include "parser.h"
#include "string_buffer.h"
#include "util.h"
#include "vector.h"
// Prints a formatted message to a StringBuffer. This automatically resizes the
// StringBuffer as necessary to fit the message. Returns the number of bytes
// written.
static int print_message(
GumboParser* parser, GumboStringBuffer* output, const char* format, ...) {
va_list args;
size_t remaining_capacity = output->capacity - output->length;
va_start(args, format);
int bytes_written = vsnprintf(
output->data + output->length, remaining_capacity, format, args);
va_end(args);
#ifdef _MSC_VER
if (bytes_written == -1) {
// vsnprintf returns -1 on MSVC++ if there's not enough capacity, instead of
// returning the number of bytes that would've been written had there been
// enough. In this case, we'll double the buffer size and hope it fits when
// we retry (letting it fail and returning 0 if it doesn't), since there's
// no way to smartly resize the buffer.
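gumbo_string_buffer_reserve(parser, output->capacity * 2, output);
// The buffer grew, so recompute the space available for the retry.
remaining_capacity = output->capacity - output->length;
va_start(args, format);
int result = vsnprintf(
output->data + output->length, remaining_capacity, format, args);
va_end(args);
if (result == -1) {
return 0;
}
output->length += result;
return result;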
}
#else
// -1 in standard C99 indicates an encoding error. Return 0 and do nothing.
if (bytes_written == -1) {
return 0;
}
#endif
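if ((size_t) bytes_written >= remaining_capacity) {
// vsnprintf reserves one byte for the terminating NUL, so a return value
// equal to the remaining capacity also means the output was truncated.
gumbo_string_buffer_reserve(
parser, output->capacity + bytes_written + 1, output);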
remaining_capacity = output->capacity - output->length;
va_start(args, format);
bytes_written = vsnprintf(
output->data + output->length, remaining_capacity, format, args);
va_end(args);
}
output->length += bytes_written;
return bytes_written;
}
static void print_tag_stack(GumboParser* parser, const GumboParserError* error,
GumboStringBuffer* output) {
print_message(parser, output, " Currently open tags: ");
for (unsigned int i = 0; i < error->tag_stack.length; ++i) {
if (i) {
print_message(parser, output, ", ");
}
GumboTag tag = (GumboTag) error->tag_stack.data[i];
// Pass the tag name as an argument rather than as the format string.
print_message(parser, output, "%s", gumbo_normalized_tagname(tag));
}
gumbo_string_buffer_append_codepoint(parser, '.', output);
}
static void handle_parser_error(GumboParser* parser,
const GumboParserError* error, GumboStringBuffer* output) {
if (error->parser_state == GUMBO_INSERTION_MODE_INITIAL &&
error->input_type != GUMBO_TOKEN_DOCTYPE) {
print_message(
parser, output, "The doctype must be the first token in the document");
return;
}
switch (error->input_type) {
case GUMBO_TOKEN_DOCTYPE:
print_message(parser, output, "This is not a legal doctype");
return;
case GUMBO_TOKEN_COMMENT:
// Should never happen; comments are always legal.
assert(0);
// But just in case...
print_message(parser, output, "Comments aren't legal here");
return;
case GUMBO_TOKEN_CDATA:
case GUMBO_TOKEN_WHITESPACE:
case GUMBO_TOKEN_CHARACTER:
print_message(parser, output, "Character tokens aren't legal here");
return;
case GUMBO_TOKEN_NULL:
print_message(parser, output, "Null bytes are not allowed in HTML5");
return;
case GUMBO_TOKEN_EOF:
if (error->parser_state == GUMBO_INSERTION_MODE_INITIAL) {
print_message(parser, output, "You must provide a doctype");
} else {
print_message(parser, output, "Premature end of file");
print_tag_stack(parser, error, output);
}
return;
case GUMBO_TOKEN_START_TAG:
case GUMBO_TOKEN_END_TAG:
print_message(parser, output, "That tag isn't allowed here");
print_tag_stack(parser, error, output);
// TODO(jdtang): Give more specific messaging.
return;
}
}
// Finds the preceding newline in an original source buffer from a given byte
// location. Returns a character pointer to the character after that, or a
// pointer to the beginning of the string if this is the first line.
static const char* find_last_newline(
const char* original_text, const char* error_location) {
assert(error_location >= original_text);
const char* c = error_location;
for (; c != original_text && *c != '\n'; --c) {
// There may be an error at EOF, which would be a nul byte.
assert(*c || c == error_location);
}
return c == original_text ? c : c + 1;
}
// Finds the next newline in the original source buffer from a given byte
// location. Returns a character pointer to that newline, or a pointer to the
// terminating null byte if this is the last line.
static const char* find_next_newline(
const char* original_text, const char* error_location) {
const char* c = error_location;
for (; *c && *c != '\n'; ++c)
;
return c;
}
GumboError* gumbo_add_error(GumboParser* parser) {
int max_errors = parser->_options->max_errors;
if (max_errors >= 0 && parser->_output->errors.length >= (unsigned int) max_errors) {
return NULL;
}
GumboError* error = gumbo_parser_allocate(parser, sizeof(GumboError));
gumbo_vector_add(parser, error, &parser->_output->errors);
return error;
}
void gumbo_error_to_string(
GumboParser* parser, const GumboError* error, GumboStringBuffer* output) {
print_message(
parser, output, "@%u:%u: ", error->position.line, error->position.column);
switch (error->type) {
case GUMBO_ERR_UTF8_INVALID:
print_message(
parser, output, "Invalid UTF8 character 0x%x", error->v.codepoint);
break;
case GUMBO_ERR_UTF8_TRUNCATED:
print_message(parser, output,
"Input stream ends with a truncated UTF8 character 0x%x",
error->v.codepoint);
break;
case GUMBO_ERR_NUMERIC_CHAR_REF_NO_DIGITS:
print_message(
parser, output, "No digits after &# in numeric character reference");
break;
case GUMBO_ERR_NUMERIC_CHAR_REF_WITHOUT_SEMICOLON:
print_message(parser, output,
"The numeric character reference &#%d should be followed "
"by a semicolon",
error->v.codepoint);
break;
case GUMBO_ERR_NUMERIC_CHAR_REF_INVALID:
print_message(parser, output,
"The numeric character reference &#%d; encodes an invalid "
"unicode codepoint",
error->v.codepoint);
break;
case GUMBO_ERR_NAMED_CHAR_REF_WITHOUT_SEMICOLON:
// The textual data came from one of the literal strings in the table, and
// so it'll be null-terminated.
print_message(parser, output,
"The named character reference &%.*s should be followed by a "
"semicolon",
(int) error->v.text.length, error->v.text.data);
break;
case GUMBO_ERR_NAMED_CHAR_REF_INVALID:
print_message(parser, output,
"The named character reference &%.*s; is not a valid entity name",
(int) error->v.text.length, error->v.text.data);
break;
case GUMBO_ERR_DUPLICATE_ATTR:
print_message(parser, output,
"Attribute %s occurs multiple times, at positions %d and %d",
error->v.duplicate_attr.name, error->v.duplicate_attr.original_index,
error->v.duplicate_attr.new_index);
break;
case GUMBO_ERR_PARSER:
case GUMBO_ERR_UNACKNOWLEDGED_SELF_CLOSING_TAG:
handle_parser_error(parser, &error->v.parser, output);
break;
default:
print_message(parser, output,
"Tokenizer error with an unimplemented error message");
break;
}
gumbo_string_buffer_append_codepoint(parser, '.', output);
}
void gumbo_caret_diagnostic_to_string(GumboParser* parser,
const GumboError* error, const char* source_text,
GumboStringBuffer* output) {
gumbo_error_to_string(parser, error, output);
const char* line_start = find_last_newline(source_text, error->original_text);
const char* line_end = find_next_newline(source_text, error->original_text);
GumboStringPiece original_line;
original_line.data = line_start;
original_line.length = line_end - line_start;
gumbo_string_buffer_append_codepoint(parser, '\n', output);
gumbo_string_buffer_append_string(parser, &original_line, output);
gumbo_string_buffer_append_codepoint(parser, '\n', output);
gumbo_string_buffer_reserve(
parser, output->length + error->position.column, output);
int num_spaces = error->position.column - 1;
memset(output->data + output->length, ' ', num_spaces);
output->length += num_spaces;
gumbo_string_buffer_append_codepoint(parser, '^', output);
gumbo_string_buffer_append_codepoint(parser, '\n', output);
}
void gumbo_print_caret_diagnostic(
GumboParser* parser, const GumboError* error, const char* source_text) {
GumboStringBuffer text;
gumbo_string_buffer_init(parser, &text);
gumbo_caret_diagnostic_to_string(parser, error, source_text, &text);
printf("%.*s", (int) text.length, text.data);
gumbo_string_buffer_destroy(parser, &text);
}
void gumbo_error_destroy(GumboParser* parser, GumboError* error) {
if (error->type == GUMBO_ERR_PARSER ||
error->type == GUMBO_ERR_UNACKNOWLEDGED_SELF_CLOSING_TAG) {
gumbo_vector_destroy(parser, &error->v.parser.tag_stack);
} else if (error->type == GUMBO_ERR_DUPLICATE_ATTR) {
gumbo_parser_deallocate(parser, (void*) error->v.duplicate_attr.name);
}
gumbo_parser_deallocate(parser, error);
}
void gumbo_init_errors(GumboParser* parser) {
gumbo_vector_init(parser, 5, &parser->_output->errors);
}
void gumbo_destroy_errors(GumboParser* parser) {
for (unsigned int i = 0; i < parser->_output->errors.length; ++i) {
gumbo_error_destroy(parser, parser->_output->errors.data[i]);
}
gumbo_vector_destroy(parser, &parser->_output->errors);
}
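
The print_message helper above relies on the C99 rule that vsnprintf returns the number of bytes the full message needs, so a truncated first attempt tells the caller how much to grow before retrying. A minimal standalone sketch of that grow-and-retry pattern follows. DemoBuffer, demo_buffer_new and demo_appendf are hypothetical names invented for illustration, the sketch uses realloc instead of Gumbo's allocator hooks, and it assumes C99 vsnprintf semantics (older MSVC runtimes return -1 on truncation and would need the separate branch shown in error.c).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, loosely modeled on GumboStringBuffer. */
typedef struct {
  char* data;
  size_t length;   /* bytes in use, excluding the terminating NUL */
  size_t capacity; /* bytes allocated */
} DemoBuffer;

/* Allocates an empty buffer with the given non-zero capacity. */
static DemoBuffer demo_buffer_new(size_t capacity) {
  assert(capacity > 0);
  DemoBuffer buf = {malloc(capacity), 0, capacity};
  assert(buf.data != NULL);
  return buf;
}

/* Appends a formatted message, growing the buffer if the first attempt was
 * truncated. Returns the number of bytes appended, or 0 on failure. */
static int demo_appendf(DemoBuffer* buf, const char* format, ...) {
  va_list args;
  size_t remaining = buf->capacity - buf->length;

  va_start(args, format);
  int needed = vsnprintf(buf->data + buf->length, remaining, format, args);
  va_end(args);
  if (needed < 0) {
    return 0; /* Encoding error under C99 semantics. */
  }

  /* vsnprintf keeps one byte for the NUL, so needed == remaining is also a
   * truncation. */
  if ((size_t) needed >= remaining) {
    size_t new_capacity = buf->length + (size_t) needed + 1;
    char* grown = realloc(buf->data, new_capacity);
    if (grown == NULL) {
      return 0;
    }
    buf->data = grown;
    buf->capacity = new_capacity;
    va_start(args, format);
    needed = vsnprintf(
        buf->data + buf->length, buf->capacity - buf->length, format, args);
    va_end(args);
    if (needed < 0) {
      return 0;
    }
  }
  buf->length += (size_t) needed;
  return needed;
}
```

The va_list has to be restarted with va_start before the second vsnprintf call, because the first call leaves it in an indeterminate state.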


@ -0,0 +1,671 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
//
// We use Gumbo as a prefix for types, gumbo_ as a prefix for functions, and
// GUMBO_ as a prefix for enum constants (static constants get the Google-style
// kGumbo prefix).
/**
* @file
* @mainpage Gumbo HTML Parser
*
* This provides a conformant, no-dependencies implementation of the HTML5
* parsing algorithm. It supports only UTF8; if you need to parse a different
* encoding, run a preprocessing step to convert to UTF8. It returns a parse
* tree made of the structs in this file.
*
* Example:
* @code
* GumboOutput* output = gumbo_parse(input);
* do_something_with_doctype(output->document);
* do_something_with_html_tree(output->root);
* gumbo_destroy_output(&kGumboDefaultOptions, output);
* @endcode
* HTML5 Spec:
*
* http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html
*/
#ifndef GUMBO_GUMBO_H_
#define GUMBO_GUMBO_H_
#ifdef _MSC_VER
#define _CRT_SECURE_NO_WARNINGS
#define fileno _fileno
#endif
#include <stdbool.h>
#include <stddef.h>
#ifdef __cplusplus
extern "C" {
#endif
/**
* A struct representing a character position within the original text buffer.
* Line and column numbers are 1-based and offsets are 0-based, which matches
* how most editors and command-line tools work. Also, columns measure
* positions in terms of characters while offsets measure by bytes; this is
* because the offset field is often used to pull out a particular region of
* text (which in most languages that bind to C implies pointer arithmetic on a
* buffer of bytes), while the column field is often used to reference a
* particular column on a printable display, which nowadays is usually UTF-8.
*/
typedef struct {
unsigned int line;
unsigned int column;
unsigned int offset;
} GumboSourcePosition;
/**
* A SourcePosition used for elements that have no source position, i.e.
* parser-inserted elements.
*/
extern const GumboSourcePosition kGumboEmptySourcePosition;
/**
* A struct representing a string or part of a string. Strings within the
* parser are represented by a char* and a length; the char* points into
* an existing data buffer owned by some other code (often the original input).
* GumboStringPieces are assumed (by convention) to be immutable, because they
* may share data. Use GumboStringBuffer if you need to construct a string.
* Clients should assume that it is not NUL-terminated, and should always use
* explicit lengths when manipulating them.
*/
typedef struct {
/** A pointer to the beginning of the string. NULL iff length == 0. */
const char* data;
/** The length of the string fragment, in bytes. May be zero. */
size_t length;
} GumboStringPiece;
/** A constant to represent a 0-length null string. */
extern const GumboStringPiece kGumboEmptyString;
/**
* Compares two GumboStringPieces, and returns true if they're equal or false
* otherwise.
*/
bool gumbo_string_equals(
const GumboStringPiece* str1, const GumboStringPiece* str2);
/**
* Compares two GumboStringPieces ignoring case, and returns true if they're
* equal or false otherwise.
*/
bool gumbo_string_equals_ignore_case(
const GumboStringPiece* str1, const GumboStringPiece* str2);
/**
* A simple vector implementation. This stores a pointer to a data array and a
* length. All elements are stored as void*; client code must cast to the
* appropriate type. Overflows upon addition result in reallocation of the data
* array, with the size doubling to maintain O(1) amortized cost. There is no
* removal function, as this isn't needed for any of the operations within this
* library. Iteration can be done through inspecting the structure directly in
* a for-loop.
*/
typedef struct {
/** Data elements. This points to a dynamically-allocated array of capacity
* elements, each a void* to the element itself.
*/
void** data;
/** Number of elements currently in the vector. */
unsigned int length;
/** Current array capacity. */
unsigned int capacity;
} GumboVector;
/** An empty (0-length, 0-capacity) GumboVector. */
extern const GumboVector kGumboEmptyVector;
/**
* Returns the first index at which an element appears in this vector (testing
* by pointer equality), or -1 if it never does.
*/
int gumbo_vector_index_of(GumboVector* vector, const void* element);
/**
* An enum for all the tags defined in the HTML5 standard. These correspond to
* the tag names themselves. Enum constants exist only for tags which appear in
* the spec itself (or for tags with special handling in the SVG and MathML
* namespaces); any other tags appear as GUMBO_TAG_UNKNOWN and the actual tag
* name can be obtained through original_tag.
*
* This is mostly for API convenience, so that clients of this library don't
* need to perform a strcasecmp to find the normalized tag name. It also has
* efficiency benefits, by letting the parser work with enums instead of
* strings.
*/
typedef enum {
// Load all the tags from an external source, generated from tag.in.
#include "gumbo/tag_enum.h"
// Used for all tags that don't have special handling in HTML. Add new tags
// to the end of tag.in so as to preserve backwards-compatibility.
GUMBO_TAG_UNKNOWN,
// A marker value to indicate the end of the enum, for iterating over it.
// Also used as the terminator for varargs functions that take tags.
GUMBO_TAG_LAST,
} GumboTag;
/**
* Returns the normalized (usually all-lowercased, except for foreign content)
* tag name for a GumboTag enum. Return value is static data owned by the
* library.
*/
const char* gumbo_normalized_tagname(GumboTag tag);
/**
* Extracts the tag name from the original_text field of an element or token by
* stripping off </> characters and attributes and adjusting the passed-in
* GumboStringPiece appropriately. The tag name is in the original case and
* shares a buffer with the original text, to simplify memory management.
* Behavior is undefined if a string-piece that doesn't represent an HTML tag
* (<tagname> or </tagname>) is passed in. If the string piece is completely
* empty (NULL data pointer), then this function will exit successfully as a
* no-op.
*/
void gumbo_tag_from_original_text(GumboStringPiece* text);
/**
* Fixes the case of SVG elements that are not all lowercase.
* http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inforeign
* This is not done at parse time because there's no place to store a mutated
* tag name. tag_name is an enum (which will be TAG_UNKNOWN for most SVG tags
* without special handling), while original_tag_name is a pointer into the
* original buffer. Instead, we provide this helper function that clients can
* use to rename SVG tags as appropriate.
* Returns the case-normalized SVG tagname if a replacement is found, or NULL if
* no normalization is called for. The return value is static data and owned by
* the library.
*/
const char* gumbo_normalize_svg_tagname(const GumboStringPiece* tagname);
/**
* Converts a tag name string (which may be in upper or mixed case) to a tag
* enum. gumbo_tag_enum expects `tagname` to be NUL-terminated, while
* gumbo_tagn_enum takes an explicit length.
*/
GumboTag gumbo_tag_enum(const char* tagname);
GumboTag gumbo_tagn_enum(const char* tagname, unsigned int length);
/**
* Attribute namespaces.
* HTML includes special handling for XLink, XML, and XMLNS namespaces on
* attributes. Everything else goes in the generic "NONE" namespace.
*/
typedef enum {
GUMBO_ATTR_NAMESPACE_NONE,
GUMBO_ATTR_NAMESPACE_XLINK,
GUMBO_ATTR_NAMESPACE_XML,
GUMBO_ATTR_NAMESPACE_XMLNS,
} GumboAttributeNamespaceEnum;
/**
* A struct representing a single attribute on an HTML tag. This is a
* name-value pair, but also includes information about source locations and
* original source text.
*/
typedef struct {
/**
* The namespace for the attribute. This will usually be
* GUMBO_ATTR_NAMESPACE_NONE, but some XLink/XMLNS/XML attributes take special
* values, per:
* http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#adjust-foreign-attributes
*/
GumboAttributeNamespaceEnum attr_namespace;
/**
* The name of the attribute. This is in a freshly-allocated buffer to deal
* with case-normalization, and is null-terminated.
*/
const char* name;
/**
* The original text of the attribute name, as a pointer into the original
* source buffer.
*/
GumboStringPiece original_name;
/**
* The value of the attribute. This is in a freshly-allocated buffer to deal
* with unescaping, and is null-terminated. It does not include any quotes
* that surround the attribute. If the attribute has no value (for example,
* 'selected' on a checkbox), this will be an empty string.
*/
const char* value;
/**
* The original text of the value of the attribute. This points into the
* original source buffer. It includes any quotes that surround the
* attribute, and you can look at original_value.data[0] and
* original_value.data[original_value.length - 1] to determine what the quote
* characters were. If the attribute has no value, this will be a 0-length
* string.
*/
GumboStringPiece original_value;
/** The starting position of the attribute name. */
GumboSourcePosition name_start;
/**
* The ending position of the attribute name. This is not always derivable
* from the starting position of the value because of the possibility of
* whitespace around the = sign.
*/
GumboSourcePosition name_end;
/** The starting position of the attribute value. */
GumboSourcePosition value_start;
/** The ending position of the attribute value. */
GumboSourcePosition value_end;
} GumboAttribute;
/**
* Given a vector of GumboAttributes, look up the one with the specified name
* and return it, or NULL if no such attribute exists. This uses a
* case-insensitive match, as HTML is case-insensitive.
*/
GumboAttribute* gumbo_get_attribute(const GumboVector* attrs, const char* name);
/**
* Enum denoting the type of node. This determines the type of the node.v
* union.
*/
typedef enum {
/** Document node. v will be a GumboDocument. */
GUMBO_NODE_DOCUMENT,
/** Element node. v will be a GumboElement. */
GUMBO_NODE_ELEMENT,
/** Text node. v will be a GumboText. */
GUMBO_NODE_TEXT,
/** CDATA node. v will be a GumboText. */
GUMBO_NODE_CDATA,
/** Comment node. v will be a GumboText, excluding comment delimiters. */
GUMBO_NODE_COMMENT,
/** Text node whose entire content is whitespace. v will be a GumboText. */
GUMBO_NODE_WHITESPACE,
/** Template node. This is separate from GUMBO_NODE_ELEMENT because many
* client libraries will want to ignore the contents of template nodes, as
* the spec suggests. Recursing on GUMBO_NODE_ELEMENT will do the right thing
* here, while clients that want to include template contents should also
* check for GUMBO_NODE_TEMPLATE. v will be a GumboElement. */
GUMBO_NODE_TEMPLATE
} GumboNodeType;
/**
* Forward declaration of GumboNode so it can be used recursively in
* GumboNode.parent.
*/
typedef struct GumboInternalNode GumboNode;
/**
* http://www.whatwg.org/specs/web-apps/current-work/complete/dom.html#quirks-mode
*/
typedef enum {
GUMBO_DOCTYPE_NO_QUIRKS,
GUMBO_DOCTYPE_QUIRKS,
GUMBO_DOCTYPE_LIMITED_QUIRKS
} GumboQuirksModeEnum;
/**
* Namespaces.
* Unlike in X(HT)ML, namespaces in HTML5 are not denoted by a prefix. Rather,
* anything inside an <svg> tag is in the SVG namespace, anything inside the
* <math> tag is in the MathML namespace, and anything else is inside the HTML
* namespace. No other namespaces are supported, so this can be an enum only.
*/
typedef enum {
GUMBO_NAMESPACE_HTML,
GUMBO_NAMESPACE_SVG,
GUMBO_NAMESPACE_MATHML
} GumboNamespaceEnum;
/**
* Parse flags.
* We track the reasons for parser insertion of nodes and store them in a
* bitvector in the node itself. This lets client code optimize out nodes that
* are implied by the HTML structure of the document, or flag constructs that
* may not be allowed by a style guide, or track the prevalence of incorrect or
* tricky HTML code.
*/
typedef enum {
/**
* A normal node - both start and end tags appear in the source, nothing has
* been reparented.
*/
GUMBO_INSERTION_NORMAL = 0,
/**
* A node inserted by the parser to fulfill some implicit insertion rule.
* This is usually set in addition to some other flag giving a more specific
* insertion reason; it's a generic catch-all term meaning "The start tag for
* this node did not appear in the document source".
*/
GUMBO_INSERTION_BY_PARSER = 1 << 0,
/**
* A flag indicating that the end tag for this node did not appear in the
* document source. Note that in some cases, you can still have
* parser-inserted nodes with an explicit end tag: for example, "Text</html>"
* has GUMBO_INSERTION_BY_PARSER set on the <html> node, but
* GUMBO_INSERTION_IMPLICIT_END_TAG is unset, as the </html> tag actually
* exists. This flag will be set only if the end tag is completely missing;
* in some cases, the end tag may be misplaced (eg. a </body> tag with text
* afterwards), which will leave this flag unset and require clients to
* inspect the parse errors for that case.
*/
GUMBO_INSERTION_IMPLICIT_END_TAG = 1 << 1,
// Value 1 << 2 was for a flag that has since been removed.
/**
* A flag for nodes that are inserted because their presence is implied by
* other tags, eg. <html>, <head>, <body>, <tbody>, etc.
*/
GUMBO_INSERTION_IMPLIED = 1 << 3,
/**
* A flag for nodes that are converted from their end tag equivalents. For
* example, </p> when no paragraph is open implies that the parser should
* create a <p> tag and immediately close it, while </br> means the same thing
* as <br>.
*/
GUMBO_INSERTION_CONVERTED_FROM_END_TAG = 1 << 4,
/** A flag for nodes that are converted from the parse of an <isindex> tag. */
GUMBO_INSERTION_FROM_ISINDEX = 1 << 5,
/** A flag for <image> tags that are rewritten as <img>. */
GUMBO_INSERTION_FROM_IMAGE = 1 << 6,
/**
* A flag for nodes that are cloned as a result of the reconstruction of
* active formatting elements. This is set only on the clone; the initial
* portion of the formatting run is a NORMAL node with an IMPLICIT_END_TAG.
*/
GUMBO_INSERTION_RECONSTRUCTED_FORMATTING_ELEMENT = 1 << 7,
/** A flag for nodes that are cloned by the adoption agency algorithm. */
GUMBO_INSERTION_ADOPTION_AGENCY_CLONED = 1 << 8,
/** A flag for nodes that are moved by the adoption agency algorithm. */
GUMBO_INSERTION_ADOPTION_AGENCY_MOVED = 1 << 9,
/**
* A flag for nodes that have been foster-parented out of a table (or
* should've been foster-parented, if verbatim mode is set).
*/
GUMBO_INSERTION_FOSTER_PARENTED = 1 << 10,
} GumboParseFlags;
/**
* Information specific to document nodes.
*/
typedef struct {
/**
* An array of GumboNodes, containing the children of this document. This will
* normally consist of the <html> element and any comment nodes found.
* Pointers are owned.
*/
GumboVector /* GumboNode* */ children;
// True if there was an explicit doctype token as opposed to it being omitted.
bool has_doctype;
// Fields from the doctype token, copied verbatim.
const char* name;
const char* public_identifier;
const char* system_identifier;
/**
* Whether or not the document is in QuirksMode, as determined by the values
* in the GumboTokenDocType template.
*/
GumboQuirksModeEnum doc_type_quirks_mode;
} GumboDocument;
/**
* The struct used to represent TEXT, CDATA, COMMENT, and WHITESPACE elements.
* This contains just a block of text and its position.
*/
typedef struct {
/**
* The text of this node, after entities have been parsed and decoded. For
* comment/cdata nodes, this does not include the comment delimiters.
*/
const char* text;
/**
* The original text of this node, as a pointer into the original buffer. For
* comment/cdata nodes, this includes the comment delimiters.
*/
GumboStringPiece original_text;
/**
* The starting position of this node. This corresponds to the position of
* original_text, before entities are decoded.
*/
GumboSourcePosition start_pos;
} GumboText;
/**
* The struct used to represent all HTML elements. This contains information
* about the tag, attributes, and child nodes.
*/
typedef struct {
/**
* An array of GumboNodes, containing the children of this element. Pointers
* are owned.
*/
GumboVector /* GumboNode* */ children;
/** The GumboTag enum for this element. */
GumboTag tag;
/** The GumboNamespaceEnum for this element. */
GumboNamespaceEnum tag_namespace;
/**
* A GumboStringPiece pointing to the original tag text for this element,
* pointing directly into the source buffer. If the tag was inserted
* algorithmically (for example, <head> or <tbody> insertion), this will be a
* zero-length string.
*/
GumboStringPiece original_tag;
/**
* A GumboStringPiece pointing to the original end tag text for this element.
* If the end tag was inserted algorithmically (for example, closing a
* self-closing tag), this will be a zero-length string.
*/
GumboStringPiece original_end_tag;
/** The source position for the start of the start tag. */
GumboSourcePosition start_pos;
/** The source position for the start of the end tag. */
GumboSourcePosition end_pos;
/**
* An array of GumboAttributes, containing the attributes for this tag in the
* order that they were parsed. Pointers are owned.
*/
GumboVector /* GumboAttribute* */ attributes;
} GumboElement;
/**
* A supertype for GumboElement and GumboText, so that we can include one
* generic type in lists of children and cast as necessary to subtypes.
*/
struct GumboInternalNode {
/** The type of node that this is. */
GumboNodeType type;
/** Pointer back to parent node. Not owned. */
GumboNode* parent;
/** The index within the parent's children vector of this node. */
size_t index_within_parent;
/**
* A bitvector of flags containing information about why this element was
* inserted into the parse tree, including a variety of special parse
* situations.
*/
GumboParseFlags parse_flags;
/** The actual node data. */
union {
GumboDocument document; // For GUMBO_NODE_DOCUMENT.
GumboElement element; // For GUMBO_NODE_ELEMENT.
GumboText text; // For everything else.
} v;
};
/**
* The type for an allocator function. Takes the 'userdata' member of the
* GumboParser struct as its first argument. Semantics should be the same as
* malloc, i.e. return a block of 'size' bytes on success or NULL on failure.
* Allocating a block of 0 bytes behaves as per malloc.
*/
// TODO(jdtang): Add checks throughout the codebase for out-of-memory condition.
typedef void* (*GumboAllocatorFunction)(void* userdata, size_t size);
/**
* The type for a deallocator function. Takes the 'userdata' member of the
* GumboParser struct as its first argument.
*/
typedef void (*GumboDeallocatorFunction)(void* userdata, void* ptr);
/**
* Input struct containing configuration options for the parser.
* These let you specify alternate memory managers, provide different error
* handling, etc.
* Use kGumboDefaultOptions for sensible defaults, and only set what you need.
*/
typedef struct GumboInternalOptions {
/** A memory allocator function. Default: malloc. */
GumboAllocatorFunction allocator;
/** A memory deallocator function. Default: free. */
GumboDeallocatorFunction deallocator;
/**
* An opaque object that's passed in as the first argument to all callbacks
* used by this library. Default: NULL.
*/
void* userdata;
/**
* The tab-stop size, for computing positions in source code that uses tabs.
* Default: 8.
*/
int tab_stop;
/**
* Whether or not to stop parsing when the first error is encountered.
* Default: false.
*/
bool stop_on_first_error;
/**
* The maximum number of errors before the parser stops recording them. This
* is provided so that if the page is totally borked, we don't completely fill
* up the errors vector and exhaust memory with useless redundant errors. Set
* to -1 to disable the limit.
* Default: -1
*/
int max_errors;
/**
* The fragment context for parsing:
* https://html.spec.whatwg.org/multipage/syntax.html#parsing-html-fragments
*
* If GUMBO_TAG_LAST is passed here, it is assumed to be "no fragment", i.e.
* the regular parsing algorithm. Otherwise, pass the tag enum for the
* intended parent of the parsed fragment. We use just the tag enum rather
* than a full node because that's enough to set all the parsing context we
* need, and it provides some additional flexibility for client code to act as
* if parsing a fragment even when a full HTML tree isn't available.
*
* Default: GUMBO_TAG_LAST
*/
GumboTag fragment_context;
/**
* The namespace for the fragment context. This lets client code
* differentiate between, say, parsing a <title> tag in SVG vs. parsing it in
* HTML.
* Default: GUMBO_NAMESPACE_HTML
*/
GumboNamespaceEnum fragment_namespace;
} GumboOptions;
/** Default options struct; use this with gumbo_parse_with_options. */
extern const GumboOptions kGumboDefaultOptions;
/** The output struct containing the results of the parse. */
typedef struct GumboInternalOutput {
/**
* Pointer to the document node. This is a GumboNode of type
* GUMBO_NODE_DOCUMENT that contains the entire document as its child.
*/
GumboNode* document;
/**
* Pointer to the root node. This is the <html> tag that forms the root of the
* document.
*/
GumboNode* root;
/**
* A list of errors that occurred during the parse.
* NOTE: In version 1.0 of this library, the API for errors hasn't been fully
* fleshed out and may change in the future. For this reason, the GumboError
* header isn't part of the public API. Contact us if you need errors
* reported so we can work out something appropriate for your use-case.
*/
GumboVector /* GumboError */ errors;
} GumboOutput;
/**
* Parses a buffer of UTF-8 text into a GumboNode parse tree. The buffer must
* live at least as long as the parse tree, as some fields (eg. original_text)
* point directly into the original buffer.
*
* This doesn't support buffers longer than 4 gigabytes.
*/
GumboOutput* gumbo_parse(const char* buffer);
/**
* Extended version of gumbo_parse that takes an explicit options structure,
* buffer, and length.
*/
GumboOutput* gumbo_parse_with_options(
const GumboOptions* options, const char* buffer, size_t buffer_length);
/** Release the memory used for the parse tree & parse errors. */
void gumbo_destroy_output(const GumboOptions* options, GumboOutput* output);
#ifdef __cplusplus
}
#endif
#endif // GUMBO_GUMBO_H_
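
Since parse_flags is documented above as a bitvector, a client would test individual bits with a mask instead of comparing the whole value. The following is a minimal sketch of that idea and not part of the vendored header: the SKETCH_* constants only mirror the numeric values shown in GumboParseFlags, and SketchNode and both helper functions are hypothetical names introduced so the snippet compiles without the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local mirror of a few GumboParseFlags values from the header above.
 * These names are illustrative only; real code would use the GUMBO_* enum. */
enum {
  SKETCH_INSERTION_BY_PARSER = 1 << 0,
  SKETCH_INSERTION_IMPLICIT_END_TAG = 1 << 1,
  SKETCH_INSERTION_IMPLIED = 1 << 3
};

/* Hypothetical stand-in for the parse_flags member of a node. */
typedef struct {
  unsigned int parse_flags;
} SketchNode;

/* True if the start tag for this node did not appear in the source. */
bool sketch_start_tag_is_synthetic(const SketchNode* node) {
  assert(node != NULL);
  return (node->parse_flags & SKETCH_INSERTION_BY_PARSER) != 0;
}

/* True if the end tag for this node is completely missing from the source. */
bool sketch_end_tag_is_implicit(const SketchNode* node) {
  assert(node != NULL);
  return (node->parse_flags & SKETCH_INSERTION_IMPLICIT_END_TAG) != 0;
}
```

Several flags can be set at once (for instance BY_PARSER together with IMPLIED), which is why the helpers mask one bit rather than test for equality.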

View File

@ -0,0 +1,37 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
#ifndef GUMBO_ATTRIBUTE_H_
#define GUMBO_ATTRIBUTE_H_
#include "gumbo.h"
#ifdef __cplusplus
extern "C" {
#endif
struct GumboInternalParser;
// Release the memory used for a GumboAttribute, including the attribute
// itself.
void gumbo_destroy_attribute(
struct GumboInternalParser* parser, GumboAttribute* attribute);
#ifdef __cplusplus
}
#endif
#endif // GUMBO_ATTRIBUTE_H_

View File

@ -0,0 +1,60 @@
// Copyright 2011 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
//
// Internal header for character reference handling; this should not be exposed
// transitively by any public API header. This is why the functions aren't
// namespaced.
#ifndef GUMBO_CHAR_REF_H_
#define GUMBO_CHAR_REF_H_
#include <stdbool.h>
#ifdef __cplusplus
extern "C" {
#endif
struct GumboInternalParser;
struct GumboInternalUtf8Iterator;
// Value that indicates no character was produced.
extern const int kGumboNoChar;
// Certain named character references generate two codepoints, not one, and so
// the consume_char_ref subroutine needs to return this instead of an int. The
// first field will be kGumboNoChar if no character reference was found; the
// second field will be kGumboNoChar if that is the case or if the character
// reference returns only a single codepoint.
typedef struct {
int first;
int second;
} OneOrTwoCodepoints;
// Implements the "consume a character reference" section of the spec.
// This reads in characters from the input as necessary, and fills in a
// OneOrTwoCodepoints struct containing the characters read. It may add parse
// errors to the GumboParser's errors vector, if the spec calls for it. Pass a
// space for the "additional allowed char" when the spec says "with no
// additional allowed char". Returns false on parse error, true otherwise.
bool consume_char_ref(struct GumboInternalParser* parser,
struct GumboInternalUtf8Iterator* input, int additional_allowed_char,
bool is_in_attribute, OneOrTwoCodepoints* output);
#ifdef __cplusplus
}
#endif
#endif // GUMBO_CHAR_REF_H_
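
The comments above describe how the two fields of OneOrTwoCodepoints encode "no result", "one codepoint" and "two codepoints". A rough sketch of how a caller might interpret that struct follows. It is illustrative only: SketchCodepoints and sketch_codepoint_count are hypothetical names, and the sentinel value -1 is an assumption, since this header only declares kGumboNoChar as extern and does not show its value.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel; the header only declares kGumboNoChar without a value. */
#define SKETCH_NO_CHAR (-1)

/* Local mirror of the OneOrTwoCodepoints layout shown above. */
typedef struct {
  int first;
  int second;
} SketchCodepoints;

/* Returns how many codepoints the result carries: 0, 1 or 2. */
int sketch_codepoint_count(const SketchCodepoints* result) {
  assert(result != NULL);
  if (result->first == SKETCH_NO_CHAR) {
    return 0; /* No character reference was found. */
  }
  return result->second == SKETCH_NO_CHAR ? 1 : 2;
}
```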

View File

@ -0,0 +1,225 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
//
// Error types, enums, and handling functions.
#ifndef GUMBO_ERROR_H_
#define GUMBO_ERROR_H_
#ifdef _MSC_VER
#define _CRT_SECURE_NO_WARNINGS
#endif
#include <stdint.h>
#include "gumbo.h"
#include "insertion_mode.h"
#include "string_buffer.h"
#include "token_type.h"
#ifdef __cplusplus
extern "C" {
#endif
struct GumboInternalParser;
typedef enum {
GUMBO_ERR_UTF8_INVALID,
GUMBO_ERR_UTF8_TRUNCATED,
GUMBO_ERR_UTF8_NULL,
GUMBO_ERR_NUMERIC_CHAR_REF_NO_DIGITS,
GUMBO_ERR_NUMERIC_CHAR_REF_WITHOUT_SEMICOLON,
GUMBO_ERR_NUMERIC_CHAR_REF_INVALID,
GUMBO_ERR_NAMED_CHAR_REF_WITHOUT_SEMICOLON,
GUMBO_ERR_NAMED_CHAR_REF_INVALID,
GUMBO_ERR_TAG_STARTS_WITH_QUESTION,
GUMBO_ERR_TAG_EOF,
GUMBO_ERR_TAG_INVALID,
GUMBO_ERR_CLOSE_TAG_EMPTY,
GUMBO_ERR_CLOSE_TAG_EOF,
GUMBO_ERR_CLOSE_TAG_INVALID,
GUMBO_ERR_SCRIPT_EOF,
GUMBO_ERR_ATTR_NAME_EOF,
GUMBO_ERR_ATTR_NAME_INVALID,
GUMBO_ERR_ATTR_DOUBLE_QUOTE_EOF,
GUMBO_ERR_ATTR_SINGLE_QUOTE_EOF,
GUMBO_ERR_ATTR_UNQUOTED_EOF,
GUMBO_ERR_ATTR_UNQUOTED_RIGHT_BRACKET,
GUMBO_ERR_ATTR_UNQUOTED_EQUALS,
GUMBO_ERR_ATTR_AFTER_EOF,
GUMBO_ERR_ATTR_AFTER_INVALID,
GUMBO_ERR_DUPLICATE_ATTR,
GUMBO_ERR_SOLIDUS_EOF,
GUMBO_ERR_SOLIDUS_INVALID,
GUMBO_ERR_DASHES_OR_DOCTYPE,
GUMBO_ERR_COMMENT_EOF,
GUMBO_ERR_COMMENT_INVALID,
GUMBO_ERR_COMMENT_BANG_AFTER_DOUBLE_DASH,
GUMBO_ERR_COMMENT_DASH_AFTER_DOUBLE_DASH,
GUMBO_ERR_COMMENT_SPACE_AFTER_DOUBLE_DASH,
GUMBO_ERR_COMMENT_END_BANG_EOF,
GUMBO_ERR_DOCTYPE_EOF,
GUMBO_ERR_DOCTYPE_INVALID,
GUMBO_ERR_DOCTYPE_SPACE,
GUMBO_ERR_DOCTYPE_RIGHT_BRACKET,
GUMBO_ERR_DOCTYPE_SPACE_OR_RIGHT_BRACKET,
GUMBO_ERR_DOCTYPE_END,
GUMBO_ERR_PARSER,
GUMBO_ERR_UNACKNOWLEDGED_SELF_CLOSING_TAG,
} GumboErrorType;
// Additional data for duplicated attributes.
typedef struct GumboInternalDuplicateAttrError {
// The name of the attribute. Owned by this struct.
const char* name;
// The (0-based) index within the attributes vector of the original
// occurrence.
unsigned int original_index;
// The (0-based) index where the new occurrence would be.
unsigned int new_index;
} GumboDuplicateAttrError;
// A simplified representation of the tokenizer state, designed to be more
// useful to clients of this library than the internal representation. This
// condenses the actual states used in the tokenizer state machine into a few
// values that will be familiar to users of HTML.
typedef enum {
GUMBO_ERR_TOKENIZER_DATA,
GUMBO_ERR_TOKENIZER_CHAR_REF,
GUMBO_ERR_TOKENIZER_RCDATA,
GUMBO_ERR_TOKENIZER_RAWTEXT,
GUMBO_ERR_TOKENIZER_PLAINTEXT,
GUMBO_ERR_TOKENIZER_SCRIPT,
GUMBO_ERR_TOKENIZER_TAG,
GUMBO_ERR_TOKENIZER_SELF_CLOSING_TAG,
GUMBO_ERR_TOKENIZER_ATTR_NAME,
GUMBO_ERR_TOKENIZER_ATTR_VALUE,
GUMBO_ERR_TOKENIZER_MARKUP_DECLARATION,
GUMBO_ERR_TOKENIZER_COMMENT,
GUMBO_ERR_TOKENIZER_DOCTYPE,
GUMBO_ERR_TOKENIZER_CDATA,
} GumboTokenizerErrorState;
// Additional data for tokenizer errors.
// This records the current state and codepoint encountered - this is usually
// enough to reconstruct what went wrong and provide a friendly error message.
typedef struct GumboInternalTokenizerError {
// The bad codepoint encountered.
int codepoint;
// The state that the tokenizer was in at the time.
GumboTokenizerErrorState state;
} GumboTokenizerError;
// Additional data for parse errors.
typedef struct GumboInternalParserError {
// The type of input token that resulted in this error.
GumboTokenType input_type;
// The HTML tag of the input token. TAG_UNKNOWN if this was not a tag token.
GumboTag input_tag;
// The insertion mode that the parser was in at the time.
GumboInsertionMode parser_state;
// The tag stack at the point of the error. Note that this is a GumboVector
// of GumboTags *stored by value* - cast the void* to a GumboTag directly to
// get at the tag.
GumboVector /* GumboTag */ tag_stack;
} GumboParserError;
// The overall error struct representing an error in decoding/tokenizing/parsing
// the HTML. This contains an enumerated type flag, a source position, and then
// a union of fields containing data specific to the error.
typedef struct GumboInternalError {
// The type of error.
GumboErrorType type;
// The position within the source file where the error occurred.
GumboSourcePosition position;
// A pointer to the byte within the original source file text where the error
// occurred (note that this is not the same as position.offset, as that gives
// character-based instead of byte-based offsets).
const char* original_text;
// Type-specific error information.
union {
// The code point we encountered, for:
// * GUMBO_ERR_UTF8_INVALID
// * GUMBO_ERR_UTF8_TRUNCATED
// * GUMBO_ERR_NUMERIC_CHAR_REF_WITHOUT_SEMICOLON
// * GUMBO_ERR_NUMERIC_CHAR_REF_INVALID
uint64_t codepoint;
// Tokenizer errors.
GumboTokenizerError tokenizer;
// Short textual data, for:
// * GUMBO_ERR_NAMED_CHAR_REF_WITHOUT_SEMICOLON
// * GUMBO_ERR_NAMED_CHAR_REF_INVALID
GumboStringPiece text;
// Duplicate attribute data, for GUMBO_ERR_DUPLICATE_ATTR.
GumboDuplicateAttrError duplicate_attr;
// Parser state, for GUMBO_ERR_PARSER and
// GUMBO_ERR_UNACKNOWLEDGED_SELF_CLOSING_TAG.
struct GumboInternalParserError parser;
} v;
} GumboError;
// Adds a new error to the parser's error list, and returns a pointer to it so
// that clients can fill out the rest of its fields. May return NULL if we're
// already over the max_errors field specified in GumboOptions.
GumboError* gumbo_add_error(struct GumboInternalParser* parser);
// Initializes the errors vector in the parser.
void gumbo_init_errors(struct GumboInternalParser* parser);
// Frees all the errors in the 'errors_' field of the parser.
void gumbo_destroy_errors(struct GumboInternalParser* parser);
// Frees the memory used for a single GumboError.
void gumbo_error_destroy(struct GumboInternalParser* parser, GumboError* error);
// Prints an error to a string. This fills an empty GumboStringBuffer with a
// freshly-allocated buffer containing the error message text. The caller is
// responsible for deleting the buffer. (Note that the buffer is allocated with
// the allocator specified in the GumboParser config and hence should be freed
// by gumbo_parser_deallocate().)
void gumbo_error_to_string(struct GumboInternalParser* parser,
const GumboError* error, GumboStringBuffer* output);
// Prints a caret diagnostic to a string. This fills an empty GumboStringBuffer
// with a freshly-allocated buffer containing the error message text. The
// caller is responsible for deleting the buffer. (Note that the buffer is
// allocated with the allocator specified in the GumboParser config and hence
// should be freed by gumbo_parser_deallocate().)
void gumbo_caret_diagnostic_to_string(struct GumboInternalParser* parser,
const GumboError* error, const char* source_text,
GumboStringBuffer* output);
// Like gumbo_caret_diagnostic_to_string, but prints the text to stdout instead
// of writing to a string.
void gumbo_print_caret_diagnostic(struct GumboInternalParser* parser,
const GumboError* error, const char* source_text);
#ifdef __cplusplus
}
#endif
#endif // GUMBO_ERROR_H_

View File

@ -0,0 +1,57 @@
// Copyright 2011 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
#ifndef GUMBO_INSERTION_MODE_H_
#define GUMBO_INSERTION_MODE_H_
#ifdef __cplusplus
extern "C" {
#endif
// http://www.whatwg.org/specs/web-apps/current-work/complete/parsing.html#insertion-mode
// If new enum values are added, be sure to update the kTokenHandlers dispatch
// table in parser.c.
typedef enum {
GUMBO_INSERTION_MODE_INITIAL,
GUMBO_INSERTION_MODE_BEFORE_HTML,
GUMBO_INSERTION_MODE_BEFORE_HEAD,
GUMBO_INSERTION_MODE_IN_HEAD,
GUMBO_INSERTION_MODE_IN_HEAD_NOSCRIPT,
GUMBO_INSERTION_MODE_AFTER_HEAD,
GUMBO_INSERTION_MODE_IN_BODY,
GUMBO_INSERTION_MODE_TEXT,
GUMBO_INSERTION_MODE_IN_TABLE,
GUMBO_INSERTION_MODE_IN_TABLE_TEXT,
GUMBO_INSERTION_MODE_IN_CAPTION,
GUMBO_INSERTION_MODE_IN_COLUMN_GROUP,
GUMBO_INSERTION_MODE_IN_TABLE_BODY,
GUMBO_INSERTION_MODE_IN_ROW,
GUMBO_INSERTION_MODE_IN_CELL,
GUMBO_INSERTION_MODE_IN_SELECT,
GUMBO_INSERTION_MODE_IN_SELECT_IN_TABLE,
GUMBO_INSERTION_MODE_IN_TEMPLATE,
GUMBO_INSERTION_MODE_AFTER_BODY,
GUMBO_INSERTION_MODE_IN_FRAMESET,
GUMBO_INSERTION_MODE_AFTER_FRAMESET,
GUMBO_INSERTION_MODE_AFTER_AFTER_BODY,
GUMBO_INSERTION_MODE_AFTER_AFTER_FRAMESET
} GumboInsertionMode;
#ifdef __cplusplus
} // extern C
#endif
#endif // GUMBO_INSERTION_MODE_H_

View File

@ -0,0 +1,57 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
//
// Contains the definition of the top-level GumboParser structure that's
// threaded through basically every internal function in the library.
#ifndef GUMBO_PARSER_H_
#define GUMBO_PARSER_H_
#ifdef __cplusplus
extern "C" {
#endif
struct GumboInternalParserState;
struct GumboInternalOutput;
struct GumboInternalOptions;
struct GumboInternalTokenizerState;
// An overarching struct that's threaded through (nearly) all functions in the
// library, OOP-style. This gives each function access to the options and
// output, along with any internal state needed for the parse.
typedef struct GumboInternalParser {
// Settings for this parse run.
const struct GumboInternalOptions* _options;
// Output for the parse.
struct GumboInternalOutput* _output;
// The internal tokenizer state, defined as a pointer to avoid a cyclic
// dependency on html5tokenizer.h. The main parse routine is responsible for
// initializing this on parse start, and destroying it on parse end.
// End-users will never see a non-garbage value in this pointer.
struct GumboInternalTokenizerState* _tokenizer_state;
// The internal parser state. Initialized on parse start and destroyed on
// parse end; end-users will never see a non-garbage value in this pointer.
struct GumboInternalParserState* _parser_state;
} GumboParser;
#ifdef __cplusplus
}
#endif
#endif // GUMBO_PARSER_H_

View File

@ -0,0 +1,84 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
//
#ifndef GUMBO_STRING_BUFFER_H_
#define GUMBO_STRING_BUFFER_H_
#include <stdbool.h>
#include <stddef.h>
#include "gumbo.h"
#ifdef __cplusplus
extern "C" {
#endif
struct GumboInternalParser;
// A struct representing a mutable, growable string. This consists of a
// heap-allocated buffer that may grow (by doubling) as necessary. When
// converting to a string, this allocates a new buffer that is only as long as
// it needs to be. Note that the internal buffer here is *not* nul-terminated,
// so be sure not to use ordinary string manipulation functions on it.
typedef struct {
// A pointer to the beginning of the string. NULL iff length == 0.
char* data;
// The length of the string fragment, in bytes. May be zero.
size_t length;
// The capacity of the buffer, in bytes.
size_t capacity;
} GumboStringBuffer;
// Initializes a new GumboStringBuffer.
void gumbo_string_buffer_init(
struct GumboInternalParser* parser, GumboStringBuffer* output);
// Ensures that the buffer contains at least a certain amount of space. Most
// useful with snprintf and the other length-delimited string functions, which
// may want to write directly into the buffer.
void gumbo_string_buffer_reserve(struct GumboInternalParser* parser,
size_t min_capacity, GumboStringBuffer* output);
// Appends a single Unicode codepoint onto the end of the GumboStringBuffer.
// This is essentially a UTF-8 encoder, and may add 1-4 bytes depending on the
// value of the codepoint.
void gumbo_string_buffer_append_codepoint(
struct GumboInternalParser* parser, int c, GumboStringBuffer* output);
// Appends a string onto the end of the GumboStringBuffer.
void gumbo_string_buffer_append_string(struct GumboInternalParser* parser,
GumboStringPiece* str, GumboStringBuffer* output);
// Converts this string buffer to a char*, allocating a new buffer for it.
char* gumbo_string_buffer_to_string(
struct GumboInternalParser* parser, GumboStringBuffer* input);
// Reinitialize this string buffer. This clears it by setting length=0. It
// does not zero out the buffer itself.
void gumbo_string_buffer_clear(
struct GumboInternalParser* parser, GumboStringBuffer* input);
// Deallocates this GumboStringBuffer.
void gumbo_string_buffer_destroy(
struct GumboInternalParser* parser, GumboStringBuffer* buffer);
#ifdef __cplusplus
}
#endif
#endif // GUMBO_STRING_BUFFER_H_
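
The header above describes a buffer that grows by doubling and an append routine that acts as a UTF-8 encoder emitting 1-4 bytes per codepoint. The sketch below illustrates both ideas under stated assumptions and is not Gumbo's implementation: SketchBuffer, sketch_reserve and sketch_append_codepoint are hypothetical names, the initial capacity of 8 is arbitrary, plain realloc stands in for the parser's configurable allocator, and surrogate codepoints are not rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of the GumboStringBuffer layout; data is not nul-terminated. */
typedef struct {
  char* data;
  size_t length;
  size_t capacity;
} SketchBuffer;

/* Grows the buffer by doubling until it holds at least min_capacity bytes.
 * Returns 0 on success, -1 if allocation fails. Overflow of the doubling
 * loop is not handled in this sketch. */
int sketch_reserve(SketchBuffer* buf, size_t min_capacity) {
  size_t new_capacity;
  char* new_data;
  assert(buf != NULL);
  if (min_capacity <= buf->capacity) return 0;
  new_capacity = buf->capacity ? buf->capacity : 8; /* assumed initial size */
  while (new_capacity < min_capacity) new_capacity *= 2;
  new_data = realloc(buf->data, new_capacity);
  if (new_data == NULL) return -1;
  buf->data = new_data;
  buf->capacity = new_capacity;
  return 0;
}

/* Appends one codepoint encoded as UTF-8. Returns the number of bytes
 * written (1 to 4), or 0 if c is above U+10FFFF or allocation fails. */
size_t sketch_append_codepoint(SketchBuffer* buf, unsigned long c) {
  unsigned char bytes[4];
  size_t n;
  if (c <= 0x7F) {
    bytes[0] = (unsigned char) c;
    n = 1;
  } else if (c <= 0x7FF) {
    bytes[0] = (unsigned char) (0xC0 | (c >> 6));
    bytes[1] = (unsigned char) (0x80 | (c & 0x3F));
    n = 2;
  } else if (c <= 0xFFFF) {
    bytes[0] = (unsigned char) (0xE0 | (c >> 12));
    bytes[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
    bytes[2] = (unsigned char) (0x80 | (c & 0x3F));
    n = 3;
  } else if (c <= 0x10FFFF) {
    bytes[0] = (unsigned char) (0xF0 | (c >> 18));
    bytes[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
    bytes[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
    bytes[3] = (unsigned char) (0x80 | (c & 0x3F));
    n = 4;
  } else {
    return 0;
  }
  if (sketch_reserve(buf, buf->length + n) != 0) return 0;
  memcpy(buf->data + buf->length, bytes, n);
  buf->length += n;
  return n;
}
```

Doubling keeps the amortized cost of each append constant, at the price of up to half the capacity being unused, which matches the header's note that converting to a string allocates a fresh buffer of exactly the needed length.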

View File

@ -0,0 +1,38 @@
// Copyright 2010 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Author: jdtang@google.com (Jonathan Tang)
#ifndef GUMBO_STRING_PIECE_H_
#define GUMBO_STRING_PIECE_H_
#include "gumbo.h"
#ifdef __cplusplus
extern "C" {
#endif
struct GumboInternalParser;
// Performs a deep-copy of a GumboStringPiece, allocating a fresh buffer in the
// destination and copying over the characters from source. Dest should be
// empty, with no buffer allocated; otherwise, this leaks it.
void gumbo_string_copy(struct GumboInternalParser* parser,
GumboStringPiece* dest, const GumboStringPiece* source);
#ifdef __cplusplus
}
#endif
#endif // GUMBO_STRING_PIECE_H_

View File

@ -0,0 +1,153 @@
// Generated via `gentags.py src/tag.in`.
// Do not edit; edit src/tag.in instead.
// clang-format off
GUMBO_TAG_HTML,
GUMBO_TAG_HEAD,
GUMBO_TAG_TITLE,
GUMBO_TAG_BASE,
GUMBO_TAG_LINK,
GUMBO_TAG_META,
GUMBO_TAG_STYLE,
GUMBO_TAG_SCRIPT,
GUMBO_TAG_NOSCRIPT,
GUMBO_TAG_TEMPLATE,
GUMBO_TAG_BODY,
GUMBO_TAG_ARTICLE,
GUMBO_TAG_SECTION,
GUMBO_TAG_NAV,
GUMBO_TAG_ASIDE,
GUMBO_TAG_H1,
GUMBO_TAG_H2,
GUMBO_TAG_H3,
GUMBO_TAG_H4,
GUMBO_TAG_H5,
GUMBO_TAG_H6,
GUMBO_TAG_HGROUP,
GUMBO_TAG_HEADER,
GUMBO_TAG_FOOTER,
GUMBO_TAG_ADDRESS,
GUMBO_TAG_P,
GUMBO_TAG_HR,
GUMBO_TAG_PRE,
GUMBO_TAG_BLOCKQUOTE,
GUMBO_TAG_OL,
GUMBO_TAG_UL,
GUMBO_TAG_LI,
GUMBO_TAG_DL,
GUMBO_TAG_DT,
GUMBO_TAG_DD,
GUMBO_TAG_FIGURE,
GUMBO_TAG_FIGCAPTION,
GUMBO_TAG_MAIN,
GUMBO_TAG_DIV,
GUMBO_TAG_A,
GUMBO_TAG_EM,
GUMBO_TAG_STRONG,
GUMBO_TAG_SMALL,
GUMBO_TAG_S,
GUMBO_TAG_CITE,
GUMBO_TAG_Q,
GUMBO_TAG_DFN,
GUMBO_TAG_ABBR,
GUMBO_TAG_DATA,
GUMBO_TAG_TIME,
GUMBO_TAG_CODE,
GUMBO_TAG_VAR,
GUMBO_TAG_SAMP,
GUMBO_TAG_KBD,
GUMBO_TAG_SUB,
GUMBO_TAG_SUP,
GUMBO_TAG_I,
GUMBO_TAG_B,
GUMBO_TAG_U,
GUMBO_TAG_MARK,
GUMBO_TAG_RUBY,
GUMBO_TAG_RT,
GUMBO_TAG_RP,
GUMBO_TAG_BDI,
GUMBO_TAG_BDO,
GUMBO_TAG_SPAN,
GUMBO_TAG_BR,
GUMBO_TAG_WBR,
GUMBO_TAG_INS,
GUMBO_TAG_DEL,
GUMBO_TAG_IMAGE,
GUMBO_TAG_IMG,
GUMBO_TAG_IFRAME,
GUMBO_TAG_EMBED,
GUMBO_TAG_OBJECT,
GUMBO_TAG_PARAM,
GUMBO_TAG_VIDEO,
GUMBO_TAG_AUDIO,
GUMBO_TAG_SOURCE,
GUMBO_TAG_TRACK,
GUMBO_TAG_CANVAS,
GUMBO_TAG_MAP,
GUMBO_TAG_AREA,
GUMBO_TAG_MATH,
GUMBO_TAG_MI,
GUMBO_TAG_MO,
GUMBO_TAG_MN,
GUMBO_TAG_MS,
GUMBO_TAG_MTEXT,
GUMBO_TAG_MGLYPH,
GUMBO_TAG_MALIGNMARK,
GUMBO_TAG_ANNOTATION_XML,
GUMBO_TAG_SVG,
GUMBO_TAG_FOREIGNOBJECT,
GUMBO_TAG_DESC,
GUMBO_TAG_TABLE,
GUMBO_TAG_CAPTION,
GUMBO_TAG_COLGROUP,
GUMBO_TAG_COL,
GUMBO_TAG_TBODY,
GUMBO_TAG_THEAD,
GUMBO_TAG_TFOOT,
GUMBO_TAG_TR,
GUMBO_TAG_TD,
GUMBO_TAG_TH,
GUMBO_TAG_FORM,
GUMBO_TAG_FIELDSET,
GUMBO_TAG_LEGEND,
GUMBO_TAG_LABEL,
GUMBO_TAG_INPUT,
GUMBO_TAG_BUTTON,
GUMBO_TAG_SELECT,
GUMBO_TAG_DATALIST,
GUMBO_TAG_OPTGROUP,
GUMBO_TAG_OPTION,
GUMBO_TAG_TEXTAREA,
GUMBO_TAG_KEYGEN,
GUMBO_TAG_OUTPUT,
GUMBO_TAG_PROGRESS,
GUMBO_TAG_METER,
GUMBO_TAG_DETAILS,
GUMBO_TAG_SUMMARY,
GUMBO_TAG_MENU,
GUMBO_TAG_MENUITEM,
GUMBO_TAG_APPLET,
GUMBO_TAG_ACRONYM,
GUMBO_TAG_BGSOUND,
GUMBO_TAG_DIR,
GUMBO_TAG_FRAME,
GUMBO_TAG_FRAMESET,
GUMBO_TAG_NOFRAMES,
GUMBO_TAG_ISINDEX,
GUMBO_TAG_LISTING,
GUMBO_TAG_XMP,
GUMBO_TAG_NEXTID,
GUMBO_TAG_NOEMBED,
GUMBO_TAG_PLAINTEXT,
GUMBO_TAG_RB,
GUMBO_TAG_STRIKE,
GUMBO_TAG_BASEFONT,
GUMBO_TAG_BIG,
GUMBO_TAG_BLINK,
GUMBO_TAG_CENTER,
GUMBO_TAG_FONT,
GUMBO_TAG_MARQUEE,
GUMBO_TAG_MULTICOL,
GUMBO_TAG_NOBR,
GUMBO_TAG_SPACER,
GUMBO_TAG_TT,
GUMBO_TAG_RTC,

View File

@ -0,0 +1,105 @@
static unsigned int tag_hash(
register const char *str, register unsigned int len) {
static unsigned short asso_values[] = {296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 6, 4, 3, 1, 1, 0,
1, 0, 0, 296, 296, 296, 296, 296, 296, 296, 22, 73, 151, 4, 13, 59, 65, 2,
69, 0, 134, 9, 16, 52, 55, 28, 101, 0, 1, 6, 63, 126, 104, 93, 124, 296,
296, 296, 296, 296, 296, 296, 22, 73, 151, 4, 13, 59, 65, 2, 69, 0, 134,
9, 16, 52, 55, 28, 101, 0, 1, 6, 63, 126, 104, 93, 124, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296,
296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296, 296};
register unsigned int hval = len;
switch (hval) {
default:
hval += asso_values[(unsigned char) str[1] + 3];
/*FALLTHROUGH*/
case 1:
hval += asso_values[(unsigned char) str[0]];
break;
}
return hval + asso_values[(unsigned char) str[len - 1]];
}
static const unsigned char kGumboTagMap[] = {GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_S, GUMBO_TAG_H6, GUMBO_TAG_H5, GUMBO_TAG_H4,
GUMBO_TAG_H3, GUMBO_TAG_SPACER, GUMBO_TAG_H2, GUMBO_TAG_HEADER,
GUMBO_TAG_H1, GUMBO_TAG_HEAD, GUMBO_TAG_LAST, GUMBO_TAG_DETAILS,
GUMBO_TAG_SELECT, GUMBO_TAG_DIR, GUMBO_TAG_LAST, GUMBO_TAG_DEL,
GUMBO_TAG_LAST, GUMBO_TAG_SOURCE, GUMBO_TAG_LEGEND, GUMBO_TAG_DATALIST,
GUMBO_TAG_METER, GUMBO_TAG_MGLYPH, GUMBO_TAG_LAST, GUMBO_TAG_MATH,
GUMBO_TAG_LABEL, GUMBO_TAG_TABLE, GUMBO_TAG_TEMPLATE, GUMBO_TAG_LAST,
GUMBO_TAG_RP, GUMBO_TAG_TIME, GUMBO_TAG_TITLE, GUMBO_TAG_DATA,
GUMBO_TAG_APPLET, GUMBO_TAG_HGROUP, GUMBO_TAG_SAMP, GUMBO_TAG_TEXTAREA,
GUMBO_TAG_ABBR, GUMBO_TAG_MARQUEE, GUMBO_TAG_LAST, GUMBO_TAG_MENUITEM,
GUMBO_TAG_SMALL, GUMBO_TAG_META, GUMBO_TAG_A, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_EMBED,
GUMBO_TAG_MAP, GUMBO_TAG_LAST, GUMBO_TAG_PARAM, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_NOBR, GUMBO_TAG_P, GUMBO_TAG_SPAN, GUMBO_TAG_EM,
GUMBO_TAG_LAST, GUMBO_TAG_NOFRAMES, GUMBO_TAG_SECTION, GUMBO_TAG_NOEMBED,
GUMBO_TAG_NEXTID, GUMBO_TAG_FOOTER, GUMBO_TAG_NOSCRIPT, GUMBO_TAG_HR,
GUMBO_TAG_LAST, GUMBO_TAG_FONT, GUMBO_TAG_DL, GUMBO_TAG_TR,
GUMBO_TAG_SCRIPT, GUMBO_TAG_MO, GUMBO_TAG_LAST, GUMBO_TAG_DD,
GUMBO_TAG_MAIN, GUMBO_TAG_TD, GUMBO_TAG_FOREIGNOBJECT, GUMBO_TAG_FORM,
GUMBO_TAG_OBJECT, GUMBO_TAG_LAST, GUMBO_TAG_FIELDSET, GUMBO_TAG_LAST,
GUMBO_TAG_BGSOUND, GUMBO_TAG_MENU, GUMBO_TAG_TFOOT, GUMBO_TAG_FIGURE,
GUMBO_TAG_RB, GUMBO_TAG_LI, GUMBO_TAG_LISTING, GUMBO_TAG_BASEFONT,
GUMBO_TAG_OPTGROUP, GUMBO_TAG_LAST, GUMBO_TAG_BASE, GUMBO_TAG_ADDRESS,
GUMBO_TAG_MI, GUMBO_TAG_LAST, GUMBO_TAG_PLAINTEXT, GUMBO_TAG_LAST,
GUMBO_TAG_PROGRESS, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_ACRONYM, GUMBO_TAG_ARTICLE, GUMBO_TAG_LAST, GUMBO_TAG_PRE,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_AREA,
GUMBO_TAG_RT, GUMBO_TAG_LAST, GUMBO_TAG_OPTION, GUMBO_TAG_IMAGE,
GUMBO_TAG_DT, GUMBO_TAG_LAST, GUMBO_TAG_TT, GUMBO_TAG_HTML, GUMBO_TAG_WBR,
GUMBO_TAG_OL, GUMBO_TAG_LAST, GUMBO_TAG_STYLE, GUMBO_TAG_STRIKE,
GUMBO_TAG_SUP, GUMBO_TAG_MULTICOL, GUMBO_TAG_U, GUMBO_TAG_DFN, GUMBO_TAG_UL,
GUMBO_TAG_FIGCAPTION, GUMBO_TAG_MTEXT, GUMBO_TAG_LAST, GUMBO_TAG_VAR,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_FRAMESET, GUMBO_TAG_LAST,
GUMBO_TAG_BR, GUMBO_TAG_I, GUMBO_TAG_FRAME, GUMBO_TAG_LAST, GUMBO_TAG_DIV,
GUMBO_TAG_LAST, GUMBO_TAG_TH, GUMBO_TAG_MS, GUMBO_TAG_ANNOTATION_XML,
GUMBO_TAG_B, GUMBO_TAG_TBODY, GUMBO_TAG_THEAD, GUMBO_TAG_BIG,
GUMBO_TAG_BLOCKQUOTE, GUMBO_TAG_XMP, GUMBO_TAG_LAST, GUMBO_TAG_KBD,
GUMBO_TAG_LAST, GUMBO_TAG_LINK, GUMBO_TAG_IFRAME, GUMBO_TAG_MARK,
GUMBO_TAG_CENTER, GUMBO_TAG_OUTPUT, GUMBO_TAG_DESC, GUMBO_TAG_CANVAS,
GUMBO_TAG_COL, GUMBO_TAG_MALIGNMARK, GUMBO_TAG_IMG, GUMBO_TAG_ASIDE,
GUMBO_TAG_LAST, GUMBO_TAG_CODE, GUMBO_TAG_LAST, GUMBO_TAG_SUB, GUMBO_TAG_MN,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_INS, GUMBO_TAG_AUDIO,
GUMBO_TAG_STRONG, GUMBO_TAG_CITE, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_INPUT, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_NAV, GUMBO_TAG_LAST, GUMBO_TAG_COLGROUP,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_SVG, GUMBO_TAG_KEYGEN, GUMBO_TAG_VIDEO,
GUMBO_TAG_BDO, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_BODY, GUMBO_TAG_LAST, GUMBO_TAG_Q, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_TRACK,
GUMBO_TAG_LAST, GUMBO_TAG_BDI, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_CAPTION, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_RUBY, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_BUTTON,
GUMBO_TAG_SUMMARY, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_RTC, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_BLINK, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_LAST,
GUMBO_TAG_LAST, GUMBO_TAG_LAST, GUMBO_TAG_ISINDEX};

Some files were not shown because too many files have changed in this diff.