Lib:Hackers Guide
GNU PDF Library Hackers Guide
This is the GNU PDF Library Hackers Guide, updated for libgnupdf version 0.1.
Copyright © 2008 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
Use of Autotools
The build system of the GNU PDF software uses the GNU build utilities (otherwise known as the GNU Autotools). This chapter contains some guidelines to apply when incorporating changes into the codebase.
Third-party m4 macros
Sometimes it is useful to use third-party m4 macros provided by some
build dependency (such as libgcrypt, which provides an
AM_PATH_LIBGCRYPT macro).
Relying on the installed copy of such a macro introduces a dependency at bootstrap time, which is not desirable: dependencies should be checked at configure time.
Any third-party m4 file should therefore be copied into `libgnupdf/m4/' and put under version control. That way we avoid the dependency at bootstrap time.
Coding Conventions
As in any other GNU package, the code in the GNU PDF Library follows the coding conventions documented in the GNU Coding Standards (GCS).
In this section we complement the guidelines of the GCS with some specific conventions that we follow in the development of the library. It is important to follow these guidelines to keep the codebase coherent.
File Headers
The standard file header to be used in any source file in the library is the following:
/* -*- mode: C -*- Time-stamp: TIME_STAMP
 *
 *       File:         FILE_NAME
 *       Date:         CREATION_TIME
 *
 *       GNU PDF Project - SHORT_DESCRIPTION
 *
 */
The entries in the template are:
- TIME_STAMP
This is a time stamp with the format:
YYYY-MM-DD hh:mm:ss nickname
Note that if you are writing your code using Emacs then you will get
the timestamp automatically updated each time you save the file.
- FILE_NAME
The basename of the file.
- CREATION_TIME
A time stamp string in the format:
Fri Feb 22 21:05:05 2008
Note that if you are writing your code using Emacs then you can get
the appropriate creation date by running the
current-time-string elisp command. If you are using the gnupdf-c-file-header skeleton template then you will get the creation date at template-expansion time.
- SHORT_DESCRIPTION
A one-sentence brief description of the contents of the file. This description should not exceed one physical line of text.
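For developers who do not use Emacs, the two time formats above can be produced with the standard strftime function. The following is only a sketch: the helper names pdf_demo_creation_time and pdf_demo_time_stamp are hypothetical and are not part of the library, and the nickname that follows the TIME_STAMP is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helpers, not part of libgnupdf.  */

/* Format 'tm' as a CREATION_TIME string, e.g.
   "Fri Feb 22 21:05:05 2008".  %e pads single-digit days with a
   space, as ctime does.  Returns the number of characters written,
   or 0 if the buffer is too small.  */
size_t
pdf_demo_creation_time (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  return strftime (buf, size, "%a %b %e %H:%M:%S %Y", tm);
}

/* Format 'tm' as the "YYYY-MM-DD hh:mm:ss" part of a TIME_STAMP.
   The nickname is not appended here.  */
size_t
pdf_demo_time_stamp (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  return strftime (buf, size, "%Y-%m-%d %H:%M:%S", tm);
}
```

Both helpers take a broken-down time, so the caller decides whether to pass the result of localtime or gmtime.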
Spaces vs. Tabs
It is preferable to use space characters instead of tabs to indent the source code: the displayed width of a tab is up to the viewer program.
Please use spaces when writing code to be included in GNU PDF software.
If you use Emacs you can tell it to insert spaces instead of tabs by including:
(setq-default indent-tabs-mode nil)
in your `.emacs'.
If you use GNU indent to indent your sources you can use the
--nut option:
$ indent --nut [rest-of-parameters] [source-files]
Naming Functions
Public functions in a module
All the public functions inside a module should use the following naming convention:
pdf_MODULE-NAME_...
where module-name is the canonical name of the module
(e.g. alloc or text).
Some modules are composed of more than one compilation unit. In that case the public functions should use the following naming convention:
pdf_MODULE-NAME_PART-NAME_...
where part-name is the canonical name of that part of the module
implementation (e.g. pdf_stm_filter_... where filter is
the part name).
Private functions in a module
The private ("static") functions used in a module implementation should follow the same naming conventions as the public ones.
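As an illustration of these conventions, here is a minimal sketch built around a hypothetical module called demo with a part called buf. None of these names exist in the library; they only show how the prefixes are assembled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module "demo" with a part named "buf".  These names
   only illustrate the convention; they are not part of libgnupdf.  */

/* Private helper: declared static, yet named like a public function.  */
static size_t
pdf_demo_buf_clamp (size_t value, size_t max)
{
  assert (max > 0);
  return (value > max) ? max : value;
}

/* Public function of the module: pdf_MODULE-NAME_...  */
size_t
pdf_demo_max_size (void)
{
  return 4096;
}

/* Public function of a module part: pdf_MODULE-NAME_PART-NAME_...  */
size_t
pdf_demo_buf_size (size_t requested)
{
  return pdf_demo_buf_clamp (requested, pdf_demo_max_size ());
}
```

Because private functions share the naming scheme, only the static keyword (and the absence of a prototype in the public header) distinguishes them from public ones.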
Platform specific functions
The names of functions (both public and private) intended to be used only when compiling for a specific platform should use the following naming convention:
pdf_MODULE-NAME[_PART-NAME]_PLATFORM_...
where platform is the canonical name for the target platform:
- gnu
For GNU systems.
- posix
For POSIX systems.
- win32
For Windows systems.
- macos
For Mac OS X systems.
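The platform convention can be sketched as follows, again with a hypothetical demo module. The use of the compiler-defined _WIN32 macro to select the platform is an assumption made for this example; the library's build system may rely on its own configuration macros instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module "demo"; not part of libgnupdf.  Each variant
   follows pdf_MODULE-NAME_PLATFORM_... and is compiled only on the
   platform it targets.  */
#if defined (_WIN32)
static const char *
pdf_demo_win32_path_separator (void)
{
  return "\\";
}
#else
static const char *
pdf_demo_posix_path_separator (void)
{
  return "/";
}
#endif

/* Platform-independent entry point that dispatches to the variant
   selected at compile time.  */
const char *
pdf_demo_path_separator (void)
{
  const char *sep;

#if defined (_WIN32)
  sep = pdf_demo_win32_path_separator ();
#else
  sep = pdf_demo_posix_path_separator ();
#endif
  assert (sep != NULL);
  return sep;
}
```

Callers use only pdf_demo_path_separator, so the platform name never leaks outside the module.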
Abstract Data Types
The GNU PDF Library codebase is written in the C programming language. C does not support the notion of object as used in object-oriented programming.
Instead of objects we use a kind of data-control abstraction called abstract data types (ADTs). This abstraction provides strong encapsulation of the implementation details of the data types and thus allows the definition of opaque types.
An ADT is composed of:
- A data structure containing the private data that characterizes each instance of the ADT.
- A set of access functions that implement actions on the ADT.
Implementation Files For ADTs
Each Abstract Data Type shall be implemented in source files following this naming convention:
pdf-FOO-*.[ch]
where FOO is the name of the ADT; for example, `pdf-text-context.c'.
A general header file for the ADT should always be present and should be named:
pdf-FOO.h
where FOO is again the name of the ADT; for example, `pdf-text.h'.
Data Structures For ADTs
There are two different approaches that shall be used to define the data structures containing the private data for an ADT:
- A pointer to a structure
In this case a C structure should be defined to hold the private data:
/* Definition of the pdf_foo_t ADT */
struct pdf_foo_s
{
  int data_a;
  int data_b;
};
and then a typedef that defines pdf_foo_t as a pointer to that structure:
typedef struct pdf_foo_s *pdf_foo_t;
- A structure
In this case a C structure (not a pointer to it) is used to represent the ADT:
typedef struct pdf_foo_s pdf_foo_t;
This alternative is indicated when the private data of
the ADT is small, allowing the developer to allocate instances of the ADT on the stack and thus avoid fragmentation of the heap.
Note that both alternatives allow copying a reference using the C assignment operator, as in:
reference_to_adt_instance1 = adt_instance1;
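The two alternatives can be compared side by side in the following sketch. The pdf_bar_t type and the pdf_demo_assignment function are hypothetical and exist only for this example. Keep in mind how C assignment behaves in each case: with the pointer variant both variables end up referring to the same instance, while with the structure variant the members are copied by value.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer variant: pdf_foo_t is a pointer to the private structure.  */
struct pdf_foo_s
{
  int data_a;
  int data_b;
};
typedef struct pdf_foo_s *pdf_foo_t;

/* Structure variant (hypothetical pdf_bar_t): the type is the
   structure itself, so instances can live on the stack.  */
struct pdf_bar_s
{
  int data;
};
typedef struct pdf_bar_s pdf_bar_t;

/* Hypothetical demonstration of assignment with both variants.  */
int
pdf_demo_assignment (void)
{
  struct pdf_foo_s storage = { 1, 2 };
  pdf_foo_t foo1 = &storage;
  pdf_foo_t foo2;
  pdf_bar_t bar1 = { 10 };
  pdf_bar_t bar2;

  foo2 = foo1;          /* Both now refer to 'storage'.  */
  foo2->data_a = 5;     /* Visible through foo1 as well.  */
  assert (foo1 == foo2);

  bar2 = bar1;          /* The members are copied.  */
  bar2.data = 20;       /* bar1.data is still 10.  */

  return foo1->data_a + bar1.data;
}
```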
Access Functions For ADTs
Every access function implemented by an ADT should have a prototype conforming to the following convention:
RETURN_TYPE pdf_FOO_* (pdf_FOO_t adt, args...)
where FOO is the name of the ADT.
The following standard functions shall be defined:
- pdf_status_t pdf_FOO_new (args..., pdf_FOO_t *adt)
This is the function used to create a new instance of the ADT. The last parameter of the function should be a pointer to a pdf_FOO_t value. The returned status value should indicate the outcome of the operation.
- pdf_FOO_destroy (pdf_FOO_t adt)
This is the function used to destroy an instance of the ADT. The memory occupied by the ADT data structure is freed.
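A minimal sketch of these standard functions for the pdf_foo_t example, using the pointer variant, could look as follows. Several details are assumptions made for the example: the pdf_status_t enumeration below is a local stand-in for the library's own status type, memory is obtained with plain malloc rather than the library's allocation module, pdf_foo_destroy is given a void return type, and pdf_foo_get_sum is a hypothetical access function.

```c
#include <assert.h>
#include <stdlib.h>

/* Local stand-in for the library's status type; the real definition
   and its values live in libgnupdf and may differ.  */
typedef enum { PDF_OK = 0, PDF_ENOMEM } pdf_status_t;

struct pdf_foo_s
{
  int data_a;
  int data_b;
};
typedef struct pdf_foo_s *pdf_foo_t;

/* Create a new instance.  The new ADT is returned through the last
   parameter; the return value reports the outcome.  */
pdf_status_t
pdf_foo_new (int data_a, int data_b, pdf_foo_t *adt)
{
  pdf_foo_t foo;

  assert (adt != NULL);
  foo = malloc (sizeof (struct pdf_foo_s));
  if (foo == NULL)
    return PDF_ENOMEM;

  foo->data_a = data_a;
  foo->data_b = data_b;
  *adt = foo;
  return PDF_OK;
}

/* Hypothetical access function: the ADT is the first parameter.  */
int
pdf_foo_get_sum (pdf_foo_t adt)
{
  return adt->data_a + adt->data_b;
}

/* Destroy an instance, freeing the memory of its data structure.  */
void
pdf_foo_destroy (pdf_foo_t adt)
{
  free (adt);
}
```

Returning the instance through an output parameter leaves the return value free to carry the status, which is why pdf_FOO_new takes the ADT pointer last.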
Sending Patches
This chapter contains some useful information on how to send patches to be integrated into the trunk.
Generating a Bazaar Merge Directive
The Bazaar version control system supports the notion of merge directives. A merge directive is a kind of "superpatch" that contains an ASCII-encoded binary block describing the patch (changes to file contents, addition of new files, etc.) and a preview that is much like a regular diff.
A merge directive can be merged into a given branch much like any other branch.
To create a merge directive out of your Bazaar branch just type the following command:
$ bzr send -o my-patch
Then send the file `my-patch' in an email to pdf-devel@gnu.org so that it can be reviewed by the development team.
Note that you don't need to specify extra parameters to the bzr
send command: it will use the appropriate format for the patch by
default (unidiff).
Syntax Check
The maintainer-makefile gnulib module provides some additional make targets that are useful for maintaining the package.
One of the targets is 'syntax-check'. It checks the source code for common pitfalls and for GCS conformance.
Please run make syntax-check before sending a patch, or alternatively use the Patch Safety Dispatcher (see the next section).
Patch Safety Dispatcher
Before sending a patch to the list to be included in the trunk you can run the patch safety dispatcher, which is a script that runs a few more scripts, like the syntax check mentioned in this chapter.
In fact, the Patch Safety Dispatcher is a bzr plugin that is run before a commit is applied to your working copy. In order to execute it you need to tell bzr where the plugin is located. There are two ways to do it:
1. Copy the script located in "prmgt/patch-safety-dispatcher.py" under the project's root directory to your Bazaar plugins directory "~/.bazaar/plugins".
2. Add the "prmgt" directory to the BZR_PLUGIN_PATH variable, for example by running "export BZR_PLUGIN_PATH=/your/path/to/libgnupdf/prmgt" (alternatively you can add it to your ~/.bashrc).
After telling bzr where your plugins are, you can test the setup by running "bzr hooks" from the project's root directory. You should find the plugin in the list as "Patch safety scripts hook" in the pre_commit section.
That's all. Now when you run "bzr commit" a small report will tell you whether your patch is correct in terms of the QA scripts we run daily. If it is, the commit will be applied; otherwise it won't.
NOTE: Make sure you run "bzr commit" from the root directory of your working copy. Otherwise Bazaar will fail with an error or will not run the script at all. So far we have no solution for this problem.



