Build: Fix module sort order for PGXN installation

JIRA: MADLIB-1024

PGXN installation involves creating a single extension SQL file that
contains all the SQL commands run during MADlib deployment. The modules
added to this extension file must be placed in the right order,
taking dependencies into account.

MADlib has a function that compares a given file path with the
topologically sorted modules to decide the order of concatenation into
the extension file. This comparison is faulty because the module name is
searched for in the whole path, leading to false positives for modules
whose name contains another module's name as a substring. The specific
bug was reported as 'svec_util' being placed in the same order as 'svec'.

This commit fixes the issue by taking advantage of file paths being
of the form '.../modules/<module_name>/...', so that the whole module
name is compared rather than a substring of the path.
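
A minimal sketch of the difference between the two comparisons, not the
actual MADlib code. The function names, the example path, and the relative
order of 'svec' and 'svec_util' are all assumptions made for illustration;
only the '.../modules/<module_name>/...' path form comes from this commit.

```python
import re

# Matches the path form '.../modules/<module_name>/...' described above.
_MODULE_RE = re.compile(r"/modules/([^/]+)/")


def module_of(path):
    """Return the <module_name> path component, or None if the form is absent."""
    match = _MODULE_RE.search(path)
    return match.group(1) if match else None


def naive_rank(path, sorted_modules):
    """Faulty approach: look for each module name anywhere in the path."""
    for index, name in enumerate(sorted_modules):
        if name in path:  # 'svec' is a substring of '.../svec_util/...'
            return index
    return len(sorted_modules)


def exact_rank(path, sorted_modules):
    """Fixed approach: compare the whole module name taken from the path."""
    name = module_of(path)
    if name in sorted_modules:
        return sorted_modules.index(name)
    return len(sorted_modules)  # unknown modules sort last in this sketch


# Hypothetical topological order and file path.
sorted_modules = ["svec", "svec_util"]
path = "src/ports/postgres/modules/svec_util/svec_util.sql_in"

print(naive_rank(path, sorted_modules))  # 0: wrongly ranked as 'svec'
print(exact_rank(path, sorted_modules))  # 1: ranked as 'svec_util'
```

Sorting the SQL files with exact_rank as the key would then keep
'svec_util' files after 'svec' files in the concatenated extension file.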

Closes #106
1 file changed
tree: 958ea8833fa84f52a7ddeadaba7410124d0e2f6b
  1. cmake/
  2. deploy/
  3. doc/
  4. examples/
  5. licenses/
  6. methods/
  7. src/
  8. .gitignore
  9. CMakeLists.txt
  10. configure
  11. DISCLAIMER
  12. HAWQ_Install.txt
  13. LICENSE
  14. NOTICE
  15. pom.xml
  16. README.md
  17. ReadMe.txt
  18. ReadMe_Build.txt
  19. RELEASE_NOTES
README.md

MADlib® is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data.

Installation and Contribution

See the project webpage, MADlib Home, for links to the latest binary and source packages. For installation and contribution guides, please see the MADlib Wiki.

User and Developer Documentation

The latest documentation of MADlib modules can be found at MADlib Docs.

Architecture

The following block-diagram gives a high-level overview of MADlib's architecture.

MADlib Architecture

Third Party Components

MADlib incorporates material from the following third-party components:

  1. argparse 1.2.1 “provides an easy, declarative interface for creating command line tools”
  2. Boost 1.47.0 (or newer) “provides peer-reviewed portable C++ source libraries”
  3. Eigen 3.2.2 “is a C++ template library for linear algebra”
  4. PyYAML 3.10 “is a YAML parser and emitter for Python”
  5. PyXB 1.2.4 “is a Python library for XML Schema Bindings”

Licensing

License information regarding MADlib and included third-party libraries can be found inside the licenses directory.

Release Notes

Changes between MADlib versions are described in the RELEASE_NOTES file.

Papers and Talks

Related Software

  • PivotalR - PivotalR lets the user run the functions of the open-source big-data machine learning package MADlib directly from R.
  • PyMADlib - PyMADlib is a Python wrapper for MADlib, which brings you the power and flexibility of Python with the number-crunching power of MADlib.