uciparse.uci
============

.. py:module:: uciparse.uci

.. autoapi-nested-parse::

   Implements code to parse and emit the OpenWRT UCI_ configuration format.

   Normalizing a File
   ==================

   When normalizing a file, the goal is to standardize indenting, comment
   placement, and quoting, without changing the semantics of the file. We do this
   by reading in the file per the spec, and then emitting the same configuration
   in a standard way.

   We always emit identifiers unquoted.  We always emit values quoted, using a single
   quote unless the value contains a single quote, in which case we'll use double
   quotes.  We always indent 4 spaces.  We always put a blank line before a config
   section.  We always put a single space between fields on a single line and two
   spaces before a comment.  None of this is configurable.
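
   As an illustration of the rules above, here is a minimal sketch of how an
   option line could be emitted.  The helpers ``quote_value`` and ``emit_option``
   are hypothetical names used only for this example; they are not part of the
   module's API, and the sketch assumes the comment text already includes its
   leading ``#``.

   .. code-block:: python

       def quote_value(value: str) -> str:
           """Quote a value (hypothetical helper, not part of uciparse)."""
           # Single quotes by default; double quotes only when the value
           # itself contains a single quote.
           quote = '"' if "'" in value else "'"
           return f"{quote}{value}{quote}"


       def emit_option(name: str, value: str, comment: str = "") -> str:
           """Emit an option line using the fixed layout described above."""
           # 4-space indent, identifier unquoted, one space between fields.
           line = f"    option {name} {quote_value(value)}"
           if comment:
               # Two spaces before a trailing comment.
               line += f"  {comment}"
           return line


       print(emit_option("ssid", "it's mine", "# home network"))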

   The one thing we can't handle well is a standalone comment.  Since the file
   format is line-oriented, we don't really have any context for comments.  The
   best we can do is infer that a comment is supposed to be indented at the same
   level as an option or list if there was any whitespace before the leading ``#``
   character when we found the comment.
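
   The inference described above can be sketched roughly as follows.  The
   function name ``normalize_comment`` is an assumption made for this example,
   and the 4-space indent is taken from the normalization rules; the real
   implementation may differ in detail.

   .. code-block:: python

       def normalize_comment(line: str) -> str:
           """Normalize a comment-only line (hypothetical helper)."""
           stripped = line.lstrip()
           if not stripped.startswith("#"):
               raise ValueError(f"not a comment-only line: {line!r}")
           # Any leading whitespace at all is taken to mean "indented like an
           # option or list"; otherwise the comment stays at column zero.
           indented = len(stripped) < len(line)
           prefix = "    " if indented else ""
           return prefix + stripped.rstrip()


       print(normalize_comment("\t# applies to the option below"))
       print(normalize_comment("# top-level comment"))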


   Parser Design
   =============

   These regular expressions tell us what sort of line we're dealing with:

       - **Empty line:** ``(^\s*$)``
       - **Comment-only line:** ``(^\s*#.*$)``
       - **Package line:** ``(^\s*)(package)(\s+)(.*?)(\s*$)``
       - **Config line:** ``(^\s*)(config)(\s+)(.*?)(\s*$)``
       - **Option line:** ``(^\s*)(option)(\s+)(.*?)(\s*$)``
       - **List line:** ``(^\s*)(list)(\s+)(.*?)(\s*$)``

   Any line that does not match one of these regular expressions is an invalid
   line per the specification.

   We can simplify these regular expressions into a single check:

      ``(^\s*$)|((^\s*)(#)(.*$))|((^\s*)(package|config|option|list)(\s+)(.*?)(\s*$))``

   With this regular expression, if group #4 is ``#``, then we have a comment,
   with the comment text in group #5 and leading whitespace in group #3.  Otherwise,
   group #8 gives us the type of the line (``package``, ``config``, ``option``
   or ``list``) and group #10 gives us the remainder of the line after the type.
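
   A minimal sketch of this first pass, using the combined regular expression
   exactly as printed above, might look like the following.  The names
   ``LINE_REGEX`` and ``classify`` are assumptions made for this example and
   are not necessarily what the module uses internally.

   .. code-block:: python

       import re

       LINE_REGEX = re.compile(
           r"(^\s*$)|((^\s*)(#)(.*$))|((^\s*)(package|config|option|list)(\s+)(.*?)(\s*$))"
       )


       def classify(line: str) -> tuple[str, str]:
           """Return (kind, remainder) for a line, or raise ValueError."""
           match = LINE_REGEX.match(line)
           if match is None:
               raise ValueError(f"invalid line: {line!r}")
           if match.group(4) == "#":
               # Comment-only line: the comment text is in group 5.
               return "comment", match.group(5)
           if match.group(8):
               # Keyword line: group 8 is the type, group 10 is the remainder.
               return match.group(8), match.group(10)
           return "empty", ""


       for sample in ["", "  # hello", "\toption name 'value'  "]:
           print(classify(sample))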

   Next, we need to parse the data on each line according to the individual rules
   for the type of line.  The UCI restrictions on identifiers are enforced,
   including that empty identifiers are not legal.  We also enforce quoting rules,
   including that quotes must match.

   The spec is silent about embedded and escaped quotes within option values.
   I've chosen to assume that a double-quoted string may contain single quotes and
   vice-versa (like in Python or Perl) but that escaped quotes are not allowed.

   We do not validate boolean values.  The spec supports a specific list, but you
   can't really identify from looking at the file whether an option is supposed to
   be a boolean or just a string.

   Regardless, if the line-type regular expression above matches a line, but the
   type-specific regular expression for that line (described below) does not, then
   the line isn't valid and we can't process it.  If any line can't be processed,
   we bomb out and refuse to process the file at all.

   For a package line, we can use this regular expression:

   ``(^)((([\"'])([a-zA-Z0-9_-]+)(?:\4))|([a-zA-Z0-9_-]+))((\s*)(#.*))?($)``

   If the field is quoted, this yields the package name in group #5.  If the field
   is not quoted, this yields the package name in group #6.  The comment, if it
   exists, will be in group #9.
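
   As a rough sketch, the package regular expression can be applied to the
   remainder of a package line like this.  ``PACKAGE_REGEX`` and
   ``parse_package`` are hypothetical names chosen for the example.

   .. code-block:: python

       import re
       from typing import Optional

       PACKAGE_REGEX = re.compile(
           r"(^)((([\"'])([a-zA-Z0-9_-]+)(?:\4))|([a-zA-Z0-9_-]+))((\s*)(#.*))?($)"
       )


       def parse_package(remainder: str) -> tuple[str, Optional[str]]:
           """Return (name, comment) for the text after the package keyword."""
           match = PACKAGE_REGEX.match(remainder)
           if match is None:
               raise ValueError(f"invalid package line: {remainder!r}")
           # Group 5 holds a quoted name, group 6 an unquoted one.
           name = match.group(5) or match.group(6)
           # Group 9 holds the trailing comment, or None if there is none.
           return name, match.group(9)


       print(parse_package("'example'"))
       print(parse_package("example  # a comment"))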

   For a config line, we can use this regular expression:

   ``(^)((([\"'])([a-zA-Z0-9_-]+)(?:\4))|([a-zA-Z0-9_-]+))((\s+)((([\"'])([a-zA-Z0-9_-]+)(?:\11))|([a-zA-Z0-9_-]+)))?((\s*)(#.*))?($)``

   If the first field is quoted, this yields the section type in group #5.  If the
   first field is not quoted, this yields the section type in group #6.  If the
   second field exists and is quoted, this yields the section name in group
   #12.  If the second field is not quoted, this yields the section name in group
   #9.  The trailing comment, if it exists, will be in group #16, with leading
   whitespace stripped.
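
   The group numbers above can be exercised with a short sketch like the one
   below.  ``CONFIG_REGEX`` and ``parse_config`` are names assumed for this
   example; the pattern is the one printed above, split across several string
   literals only for readability.

   .. code-block:: python

       import re
       from typing import Optional

       CONFIG_REGEX = re.compile(
           r"(^)((([\"'])([a-zA-Z0-9_-]+)(?:\4))|([a-zA-Z0-9_-]+))"
           r"((\s+)((([\"'])([a-zA-Z0-9_-]+)(?:\11))|([a-zA-Z0-9_-]+)))?"
           r"((\s*)(#.*))?($)"
       )


       def parse_config(remainder: str) -> tuple[str, Optional[str], Optional[str]]:
           """Return (section, name, comment) for the text after the config keyword."""
           match = CONFIG_REGEX.match(remainder)
           if match is None:
               raise ValueError(f"invalid config line: {remainder!r}")
           # Section type: group 5 if quoted, group 6 if not.
           section = match.group(5) or match.group(6)
           # Section name: group 12 if quoted, otherwise group 9, which is
           # None for an anonymous section.
           name = match.group(12) or match.group(9)
           return section, name, match.group(16)


       print(parse_config("wifi-device 'radio0'"))
       print(parse_config("'example' test  # note"))
       print(parse_config("interface"))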

   The list and option lines are slightly different, since the value is required
   and is not an identifier:

   ``(^)((([\"'])([a-zA-Z0-9_-]+)(?:\4))|([a-zA-Z0-9_-]+))(\s+)((([\"'])([^\\\10]*)(?:\10))|([^'\"\s#]+))((\s*)(#.*))?($)``

   If the first field is quoted, this yields the list or option name in group #5.
   If the first field is not quoted, this yields the list or option name in group
   #6.  If the second field is quoted, this yields the value in group #11.  If the
   second field is not quoted, this yields the value in group #8.  The trailing
   comment, if it exists, will be in group #15, with leading whitespace stripped.
   The regular expression is careful to allow only embedded quotes of a different
   type, as discussed above.
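
   A small sketch of applying this regular expression to the remainder of an
   option or list line follows.  ``OPTION_REGEX`` and ``parse_option`` are
   hypothetical names for this example.  Note that a quoted value may be the
   empty string, so the sketch checks group #11 against ``None`` rather than
   for truthiness.

   .. code-block:: python

       import re
       from typing import Optional

       OPTION_REGEX = re.compile(
           r"(^)((([\"'])([a-zA-Z0-9_-]+)(?:\4))|([a-zA-Z0-9_-]+))"
           r"(\s+)((([\"'])([^\\\10]*)(?:\10))|([^'\"\s#]+))"
           r"((\s*)(#.*))?($)"
       )


       def parse_option(remainder: str) -> tuple[str, str, Optional[str]]:
           """Return (name, value, comment) for the text after option or list."""
           match = OPTION_REGEX.match(remainder)
           if match is None:
               raise ValueError(f"invalid option or list line: {remainder!r}")
           # Name: group 5 if quoted, group 6 if not.
           name = match.group(5) or match.group(6)
           # Value: group 11 if quoted (possibly empty), otherwise group 8.
           value = match.group(11) if match.group(11) is not None else match.group(8)
           return name, value, match.group(15)


       print(parse_option("'string' 'some value'"))
       print(parse_option("example value  # note"))
       print(parse_option("ssid \"it's mine\""))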


   UCI Syntax Specification
   ========================

   *Note:* This section was taken from the OpenWRT UCI_ documentation.

   The UCI configuration files usually consist of one or more config statements,
   so-called sections, with one or more option statements defining the actual
   values.

   A ``#`` begins comments in the usual way. Specifically, if a line contains a
   ``#`` outside of a string literal, it and all characters after it in the line
   are considered a comment and ignored.

   Below is an example of a simple configuration file::

       package 'example'

       config 'example' 'test'
               option   'string'      'some value'
               option   'boolean'     '1'
               list     'collection'  'first item'
               list     'collection'  'second item'

   The config ``'example' 'test'`` statement defines the start of a section with
   the type ``example`` and the name ``test``. There can also be so-called anonymous
   sections with only a type, but no name identifier. The type is important for
   the processing programs to decide how to treat the enclosed options.

   The option ``'string' 'some value'`` and option ``'boolean' '1'`` lines define simple
   values within the section. Note that there are no syntactical differences
   between text and boolean options. By convention, boolean options may have one
   of the values ``0``, ``no``, ``off``, ``false`` or ``disabled`` to specify a false value
   or ``1``, ``yes``, ``on``, ``true`` or ``enabled`` to specify a true value.  The
   lines starting with the ``list`` keyword define an option with multiple values.
   All ``list`` statements that share the same name, ``collection`` in our example, will
   be combined into a single list of values with the same order as in the
   configuration file.  The indentation of the ``option`` and ``list`` statements is a
   convention to improve the readability of the configuration file but it's not
   syntactically required.
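
   Since this module does not validate boolean values, a consumer that knows an
   option is boolean has to interpret it itself.  A minimal sketch of that
   interpretation is shown below; ``to_bool`` is a hypothetical helper, and the
   exact, case-sensitive comparison is an assumption, since the text above does
   not say whether case matters.

   .. code-block:: python

       FALSE_VALUES = {"0", "no", "off", "false", "disabled"}
       TRUE_VALUES = {"1", "yes", "on", "true", "enabled"}


       def to_bool(value: str) -> bool:
           """Interpret an option value using the boolean convention above."""
           if value in TRUE_VALUES:
               return True
           if value in FALSE_VALUES:
               return False
           # Anything else is just a string as far as the file format goes.
           raise ValueError(f"not a conventional boolean value: {value!r}")


       print(to_bool("enabled"))
       print(to_bool("0"))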

   Usually you do not need to enclose identifiers or values in quotes. Quotes are
   only required if the enclosed value contains spaces or tabs. Also it's legal to
   use double- instead of single-quotes when typing configuration options.

   All of the examples below are valid UCI syntax::

       option  example   value
       option  example  "value"
       option 'example'  value
       option 'example' "value"
       option "example" 'value'

   In contrast, the following examples are not valid UCI syntax::

       # missing quotes around the value
       option  example   v_a l u-e
       # unbalanced quotes
       option 'example" "value'

   It is important to know that UCI identifiers and config file names may contain
   only the characters ``a-zA-Z``, ``0-9`` and ``_``. E.g. no hyphens (``-``) are allowed.
   Option values may contain any character (as long as they are properly quoted).

   *(Editorial note: the statement above about identifiers is not accurate.  For
   instance, the ``/etc/config/wireless`` file uses configuration that looks like
   ``config wifi-device 'radio0'``, which clearly doesn't meet the spec.  We
   accept identifiers that include a dash, regardless of what the spec says.)*

   .. _UCI: https://openwrt.org/docs/guide-user/base-system/uci







Module Contents
---------------

.. py:exception:: UciParseError(message: str)

   Bases: :py:obj:`ValueError`


   Exception raised when a UCI file can't be parsed.


   .. py:attribute:: message


.. py:class:: UciLine

   Bases: :py:obj:`abc.ABC`


   A line in a UCI config file.


   .. py:method:: normalized() -> str
      :abstractmethod:


      Serialize the line in normalized form.



.. py:class:: UciPackageLine(name: str, comment: str | None = None)

   Bases: :py:obj:`UciLine`


   A package line in a UCI config file.


   .. py:attribute:: name


   .. py:attribute:: comment
      :value: None



   .. py:method:: normalized() -> str

      Serialize the line in normalized form.



.. py:class:: UciConfigLine(section: str, name: str | None = None, comment: str | None = None)

   Bases: :py:obj:`UciLine`


   A config line in a UCI config file.


   .. py:attribute:: section


   .. py:attribute:: name
      :value: None



   .. py:attribute:: comment
      :value: None



   .. py:method:: normalized() -> str

      Serialize the line in normalized form.



.. py:class:: UciOptionLine(name: str, value: str, comment: str | None = None)

   Bases: :py:obj:`UciLine`


   An option line in a UCI config file.


   .. py:attribute:: name


   .. py:attribute:: value


   .. py:attribute:: comment
      :value: None



   .. py:method:: normalized() -> str

      Serialize the line in normalized form.



.. py:class:: UciListLine(name: str, value: str, comment: str | None = None)

   Bases: :py:obj:`UciLine`


   A list line in a UCI config file.


   .. py:attribute:: name


   .. py:attribute:: value


   .. py:attribute:: comment
      :value: None



   .. py:method:: normalized() -> str

      Serialize the line in normalized form.



.. py:class:: UciCommentLine(comment: str, *, indented: bool = False)

   Bases: :py:obj:`UciLine`


   A comment line in a UCI config file.


   .. py:attribute:: comment


   .. py:attribute:: indented
      :value: False



   .. py:method:: normalized() -> str

      Serialize the line in normalized form.



.. py:class:: UciFile(lines: list[UciLine])

   .. py:attribute:: lines


   .. py:method:: normalized() -> list[str]

      Return a list of normalized lines comprising the file.



   .. py:method:: from_file(path: str | pathlib.Path) -> UciFile
      :staticmethod:


      Generate a UciFile from a file on disk.



   .. py:method:: from_fp(fp: TextIO) -> UciFile
      :staticmethod:


      Generate a UciFile from the contents of a file pointer.



   .. py:method:: from_text(text: str) -> UciFile
      :staticmethod:



   .. py:method:: from_lines(lines: collections.abc.Sequence[str]) -> UciFile
      :staticmethod:


      Generate a UciFile from a list of lines.



