| <!DOCTYPE html> |
| <!-- |
| | Generated by Apache Maven Doxia Site Renderer 1.8.1 from src/site/xdoc/poweredbyhbase.xml |
| | Rendered using Apache Maven Fluido Skin 1.7.1-HBase |
| --> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head> |
| <meta charset="UTF-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <meta http-equiv="Content-Language" content="en" /> |
| <title>Apache HBase – Powered By Apache HBase</title> |
| <link rel="stylesheet" href="./css/apache-maven-fluido-1.7.1-HBase.min.css" /> |
| <link rel="stylesheet" href="./css/site.css" /> |
| <link rel="stylesheet" href="./css/print.css" media="print" /> |
| <script type="text/javascript" src="./js/apache-maven-fluido-1.7.1-HBase.min.js"></script> |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/2.3.2/css/bootstrap-responsive.min.css"/> |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.9.1/styles/github.min.css"/> |
| <link rel="stylesheet" href="css/site.css"/> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.9.1/highlight.min.js"></script> |
| </head> |
| <body class="topBarEnabled"> |
| <div id="topbar" class="navbar navbar-fixed-top "> |
| <div class="navbar-inner"> |
| <div class="container"> |
| <a data-target=".nav-collapse" data-toggle="collapse" class="btn btn-navbar"> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </a> |
| <div class="nav-collapse"> |
| <ul class="nav"> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Apache HBase Project <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="index.html" title="Overview">Overview</a></li> |
| <li><a href="https://www.apache.org/licenses/" title="License">License</a></li> |
| <li><a href="downloads.html" title="Downloads">Downloads</a></li> |
| <li><a href="https://issues.apache.org/jira/browse/HBASE?report=com.atlassian.jira.plugin.system.project:changelog-panel#selectedTab=com.atlassian.jira.plugin.system.project%3Achangelog-panel" title="Release Notes">Release Notes</a></li> |
| <li><a href="coc.html" title="Code Of Conduct">Code Of Conduct</a></li> |
| <li><a href="http://blogs.apache.org/hbase/" title="Blog">Blog</a></li> |
| <li><a href="mail-lists.html" title="Mailing Lists">Mailing Lists</a></li> |
| <li><a href="team-list.html" title="Team">Team</a></li> |
| <li><a href="https://reviews.apache.org/" title="ReviewBoard">ReviewBoard</a></li> |
| <li><a href="sponsors.html" title="HBase Sponsors">HBase Sponsors</a></li> |
| <li><a href="https://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> |
| <li><a href="poweredbyhbase.html" title="Powered by HBase">Powered by HBase</a></li> |
| <li><a href="resources.html" title="Other resources">Other resources</a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Project Information <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="project-summary.html" title="Project Summary">Project Summary</a></li> |
| <li><a href="dependency-info.html" title="Dependency Information">Dependency Information</a></li> |
| <li><a href="source-repository.html" title="Source Repository">Source Repository</a></li> |
| <li><a href="issue-tracking.html" title="Issue Tracking">Issue Tracking</a></li> |
| <li><a href="dependency-management.html" title="Dependency Management">Dependency Management</a></li> |
| <li><a href="dependencies.html" title="Dependencies">Dependencies</a></li> |
| <li><a href="dependency-convergence.html" title="Dependency Convergence">Dependency Convergence</a></li> |
| <li><a href="plugin-management.html" title="Plugin Management">Plugin Management</a></li> |
| <li><a href="plugins.html" title="Plugins">Plugins</a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation and API <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="book.html" target="_blank" title="Reference Guide">Reference Guide</a></li> |
| <li><a href="apache_hbase_reference_guide.pdf" target="_blank" title="Reference Guide (PDF)">Reference Guide (PDF)</a></li> |
| <li><a href="book.html#quickstart" target="_blank" title="Getting Started">Getting Started</a></li> |
| <li><a href="apidocs/index.html" target="_blank" title="User API">User API</a></li> |
| <li><a href="testapidocs/index.html" target="_blank" title="User API (Test)">User API (Test)</a></li> |
| <li><a href="devapidocs/index.html" target="_blank" title="Developer API">Developer API</a></li> |
| <li><a href="testdevapidocs/index.html" target="_blank" title="Developer API (Test)">Developer API (Test)</a></li> |
| <li><a href="http://abloz.com/hbase/book.html" target="_blank" title="中文参考指南(单页)">中文参考指南(单页)</a></li> |
| <li><a href="book.html#faq" target="_blank" title="FAQ">FAQ</a></li> |
| <li><a href="book.html#other.info" target="_blank" title="Videos/Presentations">Videos/Presentations</a></li> |
| <li><a href="https://cwiki.apache.org/confluence/display/HADOOP2/Hbase" target="_blank" title="Wiki">Wiki</a></li> |
| <li><a href="acid-semantics.html" target="_blank" title="ACID Semantics">ACID Semantics</a></li> |
| <li><a href="book.html#arch.bulk.load" target="_blank" title="Bulk Loads">Bulk Loads</a></li> |
| <li><a href="metrics.html" target="_blank" title="Metrics">Metrics</a></li> |
| <li><a href="book.html#replication" target="_blank" title="Cluster replication">Cluster replication</a></li> |
| <li class="dropdown-submenu"> |
| <a href="" title="1.4 Documentation">1.4 Documentation</a> |
| <ul class="dropdown-menu"> |
| <li><a href="1.4/book.html" target="_blank" title="Ref Guide">Ref Guide</a></li> |
| <li><a href="1.4/book.pdf" target="_blank" title="Reference Guide (PDF)">Reference Guide (PDF)</a></li> |
| <li><a href="1.4/apidocs/index.html" target="_blank" title="User API">User API</a></li> |
| <li><a href="1.4/devapidocs/index.html" target="_blank" title="Developer API">Developer API</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu"> |
| <a href="" title="2.2 Documentation">2.2 Documentation</a> |
| <ul class="dropdown-menu"> |
| <li><a href="2.2/book.html" target="_blank" title="Ref Guide">Ref Guide</a></li> |
| <li><a href="2.2/apache_hbase_reference_guide.pdf" target="_blank" title="Reference Guide (PDF)">Reference Guide (PDF)</a></li> |
| <li><a href="2.2/apidocs/index.html" target="_blank" title="User API">User API</a></li> |
| <li><a href="2.2/testapidocs/index.html" target="_blank" title="User API (Test)">User API (Test)</a></li> |
| <li><a href="2.2/devapidocs/index.html" target="_blank" title="Developer API">Developer API</a></li> |
| <li><a href="2.2/testdevapidocs/index.html" target="_blank" title="Developer API (Test)">Developer API (Test)</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu"> |
| <a href="" title="2.3 Documentation">2.3 Documentation</a> |
| <ul class="dropdown-menu"> |
| <li><a href="2.3/book.html" target="_blank" title="Ref Guide">Ref Guide</a></li> |
| <li><a href="2.3/apache_hbase_reference_guide.pdf" target="_blank" title="Reference Guide (PDF)">Reference Guide (PDF)</a></li> |
| <li><a href="2.3/apidocs/index.html" target="_blank" title="User API">User API</a></li> |
| <li><a href="2.3/testapidocs/index.html" target="_blank" title="User API (Test)">User API (Test)</a></li> |
| <li><a href="2.3/devapidocs/index.html" target="_blank" title="Developer API">Developer API</a></li> |
| <li><a href="2.3/testdevapidocs/index.html" target="_blank" title="Developer API (Test)">Developer API (Test)</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">ASF <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="http://www.apache.org/foundation/" target="_blank" title="Apache Software Foundation">Apache Software Foundation</a></li> |
| <li><a href="http://www.apache.org/foundation/how-it-works.html" target="_blank" title="How Apache Works">How Apache Works</a></li> |
| <li><a href="http://www.apache.org/foundation/sponsorship.html" target="_blank" title="Sponsoring Apache">Sponsoring Apache</a></li> |
| </ul> |
| </li> |
| </ul> |
| <div id="search-form" class="navbar-search pull-right"> |
| <script type="text/javascript"> |
| var cx = '000385458301414556862:sq1bb0xugjg'; |
| |
| (function() { |
| var gcse = document.createElement('script'); |
| gcse.type = 'text/javascript'; |
| gcse.async = true; |
| gcse.src = 'https://cse.google.com/cse.js?cx=' + cx; |
| var s = document.getElementsByTagName('script')[0]; |
| s.parentNode.insertBefore(gcse, s); |
| })(); |
| |
| </script> |
| <gcse:search></gcse:search> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="container"> |
| <div id="banner"> |
| <div class="pull-left"><a href="./" id="bannerLeft"><img src="" alt=""/></a></div> |
| <div class="pull-right"><a href="http://hbase.apache.org/" id="bannerRight"><img src="images/hbase_logo_with_orca_large.png" alt="Apache HBase"/></a></div> |
| <div class="clear"><hr/></div> |
| </div> |
| |
| <div id="bodyColumn" > |
| |
| |
| <div class="section"> |
| <h2><a name="Powered_By_Apache_HBase.C2.99"></a>Powered By Apache HBase™</h2> |
| |
| <p>This page lists some institutions and projects which are using HBase. To |
| have your organization added, file a documentation JIRA or email |
| <a class="externalLink" href="mailto:dev@hbase.apache.org">hbase-dev</a> with the relevant |
| information. If you notice out-of-date information, use the same avenues to |
| report it. |
| </p> |
| |
| <p><b>These items are user-submitted and the HBase team assumes no responsibility for their accuracy.</b></p> |
| |
| <dl> |
| |
| <dt><a class="externalLink" href="http://www.adobe.com">Adobe</a></dt> |
| |
| <dd>We currently have about 30 nodes running HDFS, Hadoop and HBase in clusters |
| ranging from 5 to 14 nodes in both production and development. We plan a |
| deployment on an 80-node cluster. We use HBase in several areas, from |
| social services to structured data and processing for internal use. We constantly |
| write data to HBase and run MapReduce jobs to process it, then store it back to |
| HBase or external systems. Our production cluster has been running since Oct 2008.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase">Project Astro</a></dt> |
| |
| <dd> |
| Astro provides fast Spark SQL/DataFrame capabilities to HBase data, |
| featuring super-efficient access to multi-dimensional HBase rows through |
| native Spark execution in HBase coprocessor plus systematic and accurate |
| partition pruning and predicate pushdown from arbitrarily complex data |
| filtering logic. The batch load is optimized to run on the Spark execution |
| engine. Note that <a class="externalLink" href="http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase">Spark-SQL-on-HBase</a> |
| is the release site. Interested parties are free to make clones and claim |
| to be "latest (and active)", but they are not endorsed by the owner. |
| </dd> |
| |
| |
| <dt><a class="externalLink" href="http://axibase.com/products/axibase-time-series-database/">Axibase |
| Time Series Database (ATSD)</a></dt> |
| |
| <dd>ATSD runs on top of HBase to collect, analyze and visualize time series |
| data at scale. ATSD capabilities include optimized storage schema, built-in |
| rule engine, forecasting algorithms (Holt-Winters and ARIMA) and next-generation |
| graphics designed for high-frequency data. Primary use cases: IT infrastructure |
| monitoring, data consolidation, operational historian in OPC environments.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.benipaltechnologies.com">Benipal Technologies</a></dt> |
| |
| <dd>We have a 35-node cluster used for HBase and MapReduce with Lucene / SOLR |
| and Katta integration to create and fine-tune our search databases. Currently, |
| our HBase installation has over 10 billion rows with hundreds of data points per row. |
| We perform over 10<sup>18</sup> calculations daily using MapReduce directly on HBase. We |
| heart HBase.</dd> |
| |
| |
| <dt><a class="externalLink" href="https://github.com/ermanpattuk/BigSecret">BigSecret</a></dt> |
| |
| <dd>BigSecret is a security framework that is designed to secure Key-Value data, |
| while preserving efficient processing capabilities. It achieves cell-level |
| security, using combinations of different cryptographic techniques, in an |
| efficient and secure manner. It provides a wrapper library around HBase.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://caree.rs">Caree.rs</a></dt> |
| |
| <dd>Accelerated hiring platform for HiTech companies. We use HBase and Hadoop |
| for all aspects of our backend - job and company data storage, analytics |
| processing, machine learning algorithms for our hire recommendation engine. |
| Our live production site is directly served from HBase. We use Cascading for |
| running offline data processing jobs.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.celer-tech.com/">Celer Technologies</a></dt> |
| |
| <dd>Celer Technologies is a global financial software company that creates |
| modular-based systems that have the flexibility to meet tomorrow's business |
| environment, today. The Celer framework uses Hadoop/HBase for storing all |
| financial data for trading, risk, clearing in a single data store. With our |
| flexible framework and all the data in Hadoop/HBase, clients can build new |
| features to quickly extract data based on their trading, risk and clearing |
| activities from one single location.</dd> |
| |
| |
| <dt><a class="externalLink" href="https://esgyn.com/">EsgynDB</a></dt> |
| |
| <dd>EsgynDB, powered by Apache Trafodion™, provides enterprise SQL on Hadoop. |
| It includes full ACID transactions, online transaction processing and online |
| analytic processing, along with enterprise features such as disaster recovery |
| and full backup/restore. Native tables are stored in HBase, but read and write |
| access to various other file formats such as Apache Parquet and ORC is also supported. </dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.explorys.net">Explorys</a></dt> |
| |
| <dd>Explorys uses an HBase cluster containing over a billion anonymized clinical |
| records, to enable subscribers to search and analyze patient populations, |
| treatment protocols, and clinical outcomes.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.facebook.com/notes/facebook-engineering/the-underlying-technology-of-messages/454991608919">Facebook</a></dt> |
| |
| <dd>Facebook uses HBase to power their Messages infrastructure.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.filmweb.pl">Filmweb</a></dt> |
| |
| <dd>Filmweb is a film web portal with a large dataset of films, persons and |
| movie-related entities. We have just started a small cluster of 3 HBase nodes |
| to handle our web cache persistency layer. We plan to increase the cluster |
| size, and also to start migrating some of the data from our databases which |
| have some demanding scalability requirements.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.flurry.com">Flurry</a></dt> |
| |
| <dd>Flurry provides mobile application analytics. We use HBase and Hadoop for |
| all of our analytics processing, and serve all of our live requests directly |
| out of HBase on our 50 node production cluster with tens of billions of rows |
| over several tables.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://gumgum.com">GumGum</a></dt> |
| |
| <dd>GumGum is an In-Image Advertising Platform. We use HBase on a 15-node |
| Amazon EC2 High-CPU Extra Large (c1.xlarge) cluster for both real-time data |
| and analytics. Our production cluster has been running since June 2010.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://helprace.com/help-desk/">Helprace</a></dt> |
| |
| <dd>Helprace is a customer service platform which uses Hadoop for analytics |
| and internal searching and filtering. Being on HBase, we can share our HBase |
| and Hadoop cluster with other Hadoop processes - this particularly helps in |
| keeping community speeds up. We use Hadoop and HBase on a small cluster of nodes with 4 |
| cores and 32 GB RAM each.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://hubspot.com">HubSpot</a></dt> |
| |
| <dd>HubSpot is an online marketing platform, providing analytics, email, and |
| segmentation of leads/contacts. HBase is our primary datastore for our customers' |
| customer data, with multiple HBase clusters powering the majority of our |
| product. We have nearly 200 regionservers across the various clusters, and |
| 2 Hadoop clusters also with nearly 200 tasktrackers. We use c1.xlarge in EC2 |
| for both, but are starting to move some of that to bare-metal hardware. We've |
| been running HBase for over 2 years.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.infolinks.com/">Infolinks</a></dt> |
| |
| <dd>Infolinks is an In-Text ad provider. We use HBase to process advertisement |
| selection and user events for our In-Text ad network. The reports generated |
| from HBase are used as feedback for our production system to optimize ad |
| selection.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.kalooga.com">Kalooga</a></dt> |
| |
| <dd>Kalooga is a discovery service for image galleries. We use Hadoop, HBase |
| and Pig on a 20-node cluster for our crawling, analysis and events |
| processing.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.leanxcale.com/">LeanXcale</a></dt> |
| |
| <dd>LeanXcale provides an ultra-scalable transactional &amp; SQL database that |
| stores its data on HBase and can scale to thousands of nodes. It |
| also provides a standalone full ACID HBase with transactions across |
| arbitrary sets of rows and tables.</dd> |
| |
| |
| |
| <dt><a class="externalLink" href="http://www.mahalo.com">Mahalo</a></dt> |
| |
| <dd>Mahalo, "...the world's first human-powered search engine". All the markup |
| that powers the wiki is stored in HBase. It's been in use for a few months now. |
| MediaWiki - the same software that powers Wikipedia - has version/revision control. |
| Mahalo's in-house editors produce a lot of revisions per day, which was not |
| working well in an RDBMS. An HBase-based solution for this was built and tested, |
| and the data migrated out of MySQL and into HBase. Right now it's at something |
| like 6 million items in HBase. The upload tool runs every hour from a shell |
| script to back up that data, and on 6 nodes takes about 5-10 minutes to run - |
| and does not slow down production at all.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.meetup.com">Meetup</a></dt> |
| |
| <dd>Meetup is on a mission to help the world’s people self-organize into local |
| groups. We use Hadoop and HBase to power a site-wide, real-time activity |
| feed system for all of our members and groups. Group activity is written |
| directly to HBase, and indexed per member, with the member's custom feed |
| served directly from HBase for incoming requests. We're running HBase |
| 0.20.0 on an 11-node cluster.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.mendeley.com">Mendeley</a></dt> |
| |
| <dd>Mendeley is creating a platform for researchers to collaborate and share |
| their research online. HBase is helping us to create the world's largest |
| research paper collection and is being used to store all our raw imported data. |
| We use a lot of MapReduce jobs to process these papers into pages displayed |
| on the site. We also use HBase with Pig to do analytics and produce the article |
| statistics shown on the web site. You can find out more about how we use HBase |
| in the <a class="externalLink" href="http://www.slideshare.net/danharvey/hbase-at-mendeley">HBase |
| At Mendeley</a> slide presentation.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.ngdata.com">NGDATA</a></dt> |
| |
| <dd>NGDATA delivers <a class="externalLink" href="http://www.ngdata.com/site/products/lily.html">Lily</a>, |
| the consumer intelligence solution that delivers a unique combination of Big |
| Data management, machine learning technologies and consumer intelligence |
| applications in one integrated solution to allow better, and more dynamic, |
| consumer insights. Lily allows companies to process and analyze massive structured |
| and unstructured data, scale storage elastically and locate actionable data |
| quickly from large data sources in near real time.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://ning.com">Ning</a></dt> |
| |
| <dd>Ning uses HBase to store and serve the results of processing user events |
| and log files, which allows us to provide near-real-time analytics and |
| reporting. We use a small cluster of commodity machines with 4 cores and 16GB |
| of RAM per machine to handle all our analytics and reporting needs.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.worldcat.org">OCLC</a></dt> |
| |
| <dd>OCLC uses HBase as the main data store for WorldCat, a union catalog which |
| aggregates the collections of 72,000 libraries in 112 countries and territories. |
| WorldCat currently comprises nearly 1 billion records with nearly 2 |
| billion library ownership indications. We're running a 50-node HBase cluster |
| and a separate offline map-reduce cluster.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://olex.openlogic.com">OpenLogic</a></dt> |
| |
| <dd>OpenLogic stores all the world's Open Source packages, versions, files, |
| and lines of code in HBase for both near-real-time access and analytical |
| purposes. The production cluster has well over 100TB of disk spread across |
| nodes with 32GB+ RAM and dual-quad or dual-hex core CPUs.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.openplaces.org">Openplaces</a></dt> |
| |
| <dd>Openplaces is a search engine for travel that uses HBase to store terabytes |
| of web pages and travel-related entity records (countries, cities, hotels, |
| etc.). We have dozens of MapReduce jobs that crunch data on a daily basis. |
| We use a 20-node cluster for development, a 40-node cluster for offline |
| production processing and an EC2 cluster for the live web site.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.pnl.gov">Pacific Northwest National Laboratory</a></dt> |
| |
| <dd>Hadoop and HBase (Cloudera distribution) are being used within PNNL's |
| Computational Biology &amp; Bioinformatics Group for a systems biology data |
| warehouse project that integrates high throughput proteomics and transcriptomics |
| data sets coming from instruments in the Environmental Molecular Sciences |
| Laboratory, a US Department of Energy national user facility located at PNNL. |
| The data sets are being merged and annotated with other public genomics |
| information in the data warehouse environment, with Hadoop analysis programs |
| operating on the annotated data in the HBase tables. This work is hosted by |
| <a class="externalLink" href="http://www.pnl.gov/news/release.aspx?id=908">olympus</a>, a large PNNL |
| institutional computing cluster, with the HBase tables being stored in olympus's |
| Lustre file system.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.readpath.com/">ReadPath</a></dt> |
| |
| <dd>ReadPath uses HBase to store several hundred million RSS items and a dictionary |
| for its RSS newsreader. ReadPath is currently running on an 8-node cluster.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://resu.me/">resu.me</a></dt> |
| |
| <dd>Career network for the net generation. We use HBase and Hadoop for all |
| aspects of our backend - user and resume data storage, analytics processing, |
| machine learning algorithms for our job recommendation engine. Our live |
| production site is directly served from HBase. We use Cascading for running |
| offline data processing jobs.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.runa.com/">Runa Inc.</a></dt> |
| |
| <dd>Runa Inc. offers a SaaS that enables online merchants to offer dynamic |
| per-consumer, per-product promotions embedded in their website. To implement |
| this, we collect the click streams of all their visitors to determine, along |
| with the rules of the merchant, what promotion to offer the visitor at different |
| points while they browse the merchant's website. So we have lots of data and have |
| to do lots of off-line and real-time analytics. HBase is the core for us. |
| We also use Clojure and our own open sourced distributed processing framework, |
| Swarmiji. The HBase Community has been key to our forward movement with HBase. |
| We're looking for experienced developers to join us to help make things go even |
| faster!</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.sematext.com/">Sematext</a></dt> |
| |
| <dd>Sematext runs |
| <a class="externalLink" href="http://www.sematext.com/search-analytics/index.html">Search Analytics</a>, |
| a service that uses HBase to store search activity and MapReduce to produce |
| reports showing user search behaviour and experience. Sematext runs |
| <a class="externalLink" href="http://www.sematext.com/spm/index.html">Scalable Performance Monitoring (SPM)</a>, |
| a service that uses HBase to store performance data over time, crunch it with |
| the help of MapReduce, and display it in a visually rich browser-based UI. |
| Interestingly, SPM features |
| <a class="externalLink" href="http://www.sematext.com/spm/hbase-performance-monitoring/index.html">SPM for HBase</a>, |
| which is specifically designed to monitor all HBase performance metrics.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.socialmedia.com/">SocialMedia</a></dt> |
| |
| <dd>SocialMedia uses HBase to store and process user events, which allows us to |
| provide near-realtime user metrics and reporting. HBase forms the heart of |
| our Advertising Network data storage and management system. We use HBase both as |
| a data source and sink for realtime request-cycle queries and as a |
| backend for MapReduce analysis.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.splicemachine.com/">Splice Machine</a></dt> |
| |
| <dd>Splice Machine is built on top of HBase. Splice Machine is a full-featured |
| ANSI SQL database that provides real-time updates, secondary indices, ACID |
| transactions, optimized joins, triggers, and UDFs.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.streamy.com/">Streamy</a></dt> |
| |
| <dd>Streamy is a recently launched realtime social news site. We use HBase |
| for all of our data storage, query, and analysis needs, replacing an existing |
| SQL-based system. This includes hundreds of millions of documents, sparse |
| matrices, logs, and everything else once done in the relational system. We |
| perform significant in-memory caching of query results similar to a traditional |
| Memcached/SQL setup as well as other external components to perform joining |
| and sorting. We also run thousands of daily MapReduce jobs using HBase tables |
| for log analysis, attention data processing, and feed crawling. HBase has |
| helped us scale and distribute in ways we could not otherwise, and the |
| community has provided consistent and invaluable assistance.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.stumbleupon.com/">Stumbleupon</a></dt> |
| |
| <dd>Stumbleupon and <a class="externalLink" href="http://su.pr">Su.pr</a> use HBase as a real time |
| data storage and analytics platform. Serving directly out of HBase, various site |
| features and statistics are kept up to date in a real time fashion. We also |
| use HBase as a map-reduce data source to overcome traditional query speed limits |
| in MySQL.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.tokenizer.org">Shopping Engine at Tokenizer</a></dt> |
| |
| <dd>Shopping Engine at Tokenizer is a web crawler; it uses HBase to store more than |
| a billion URLs and outlinks (AnchorText + LinkedURL). It was initially |
| designed as a Nutch-Hadoop extension, then (due to a very specific 'shopping' |
| scenario) moved to SOLR + MySQL (InnoDB) (tens of thousands of queries per second), |
| and now to HBase. HBase is significantly faster because there is no need for huge |
| transaction logs, the column-oriented design exactly matches our 'lazy' business logic, |
| and it offers data compression and MapReduce support. The number of mutable 'indexes' (a term from |
| RDBMS) is significantly reduced because each 'row::column' structure |
| is physically sorted by 'row'. The MySQL InnoDB engine is the best DB choice for |
| highly concurrent updates. However, the need to flush a block of data to the |
| hard drive even if only a few bytes changed is an obvious bottleneck. HBase |
| greatly helps: the 'delete-insert', 'mutable primary key', and 'natural primary |
| key' patterns, not so popular in modern DBMSs, become a big advantage with HBase.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://traackr.com/">Traackr</a></dt> |
| |
| <dd>Traackr uses HBase to store and serve online influencer data in real-time. |
| We use MapReduce to frequently re-score our entire data set as we keep updating |
| influencer metrics on a daily basis.</dd> |
| |
| |
| <dt><a class="externalLink" href="https://trafodion.apache.org/">Trafodion</a></dt> |
| |
| <dd>Apache Trafodion™ is a webscale SQL-on-Hadoop solution enabling transactional |
| or operational workloads. It uses HBase as its storage engine for SQL tables.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://trendmicro.com/">Trend Micro</a></dt> |
| |
| <dd>Trend Micro uses HBase as a foundation for cloud scale storage for a variety |
| of applications. We have been developing with HBase since version 0.1 and |
| running it in production since version 0.20.0.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.twitter.com">Twitter</a></dt> |
| |
| <dd>Twitter runs HBase across its entire Hadoop cluster. HBase provides a |
| distributed, read/write backup of all MySQL tables in Twitter's production |
| backend, allowing engineers to run MapReduce jobs over the data while maintaining |
| the ability to apply periodic row updates (something that is more difficult |
| to do with vanilla HDFS). A number of applications including people search |
| rely on HBase internally for data generation. Additionally, the operations |
| team uses HBase as a timeseries database for cluster-wide monitoring/performance |
| data.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.udanax.org">Udanax.org</a></dt> |
| |
| <dd>Udanax.org is a URL shortener that uses a 10-node HBase cluster to store URLs |
| and web log data and to respond to real-time requests on its web server. This |
| application is now used by some Twitter clients and a number of web sites. |
| Currently, API requests run at almost 30 per second and web redirection requests |
| at about 300 per second.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.veoh.com/">Veoh Networks</a></dt> |
| |
| <dd>Veoh Networks uses HBase to store and process visitor (human) and entity |
| (non-human) profiles which are used for behavioral targeting, demographic |
| detection, and personalization services. Our site reads this data in |
| real-time (heavily cached) and submits updates via various batch map/reduce |
| jobs. With 25 million unique visitors a month, storing this data in a traditional |
| RDBMS is not an option. We currently have a 24-node Hadoop/HBase cluster and |
| our profiling system is sharing this cluster with our other Hadoop data |
| pipeline processes.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.videosurf.com/">VideoSurf</a></dt> |
| |
| <dd>VideoSurf - "The video search engine that has taught computers to see". |
| We're using HBase to persist various large graphs of data and other statistics. |
| HBase was a real win for us because it let us store substantially larger |
| datasets without manually partitioning the data, and its |
| column-oriented nature allowed us to create schemas that were substantially |
| more efficient for storing and retrieving data.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.visibletechnologies.com/">Visible Technologies</a></dt> |
| |
| <dd>Visible Technologies uses Hadoop, HBase, Katta, and more to collect, parse, |
| store, and search hundreds of millions of pieces of social media content. We get |
| incredibly fast throughput and very low latency on commodity hardware. HBase |
| enables our business to exist.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.worldlingo.com/">WorldLingo</a></dt> |
| |
| <dd>The WorldLingo Multilingual Archive. We use HBase to store millions of |
| documents that we scan using Map/Reduce jobs to machine translate them into |
| all or selected target languages from our set of available machine translation |
| languages. We currently store 12 million documents but plan to eventually |
| reach the 450 million mark. HBase allows us to scale out as we need to grow |
| our storage capacities. Combined with Hadoop, which keeps the data replicated and |
| therefore fail-safe, we have the backbone our service can rely on now and in |
| the future. WorldLingo has been using HBase since December 2007 and is, along with |
| a few others, one of the longest-running HBase installations. Currently we are |
| running the latest HBase 0.20 and serving directly from it at |
| <a class="externalLink" href="http://www.worldlingo.com/ma/enwiki/en/HBase">MultilingualArchive</a>.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.yahoo.com/">Yahoo!</a></dt> |
| |
| <dd>Yahoo! uses HBase to store document fingerprints for detecting near-duplicates. |
| We have a cluster of a few nodes that runs HDFS, MapReduce, and HBase. The table |
| contains millions of rows. We use this for querying duplicated documents with |
| real-time traffic.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://h50146.www5.hp.com/products/software/security/icewall/eng/">HP IceWall SSO</a></dt> |
| |
| <dd>HP IceWall SSO is a web-based single sign-on solution that uses HBase to store |
| the user data needed to authenticate users. We previously supported RDB and LDAP |
| and have now added HBase support with a view to authenticating tens of millions |
| of users and devices.</dd> |
| |
| |
| <dt><a class="externalLink" href="http://www.ymc.ch/en/big-data-analytics-en?utm_source=hadoopwiki&utm_medium=poweredbypage&utm_campaign=ymc.ch">YMC AG</a></dt> |
| |
| <dd> |
| <ul> |
| |
| <li>operating a Cloudera Hadoop/HBase cluster for media monitoring purposes</li> |
| |
| <li>offering technical and operational consulting for the Hadoop stack and ecosystem</li> |
| |
| <li>editor of <a class="externalLink" href="http://www.ymc.ch/en/hbase-split-visualisation-introducing-hannibal?utm_source=hadoopwiki&utm_medium=poweredbypage&utm_campaign=ymc.ch">Hannibal</a>, an open-source tool |
| that visualizes HBase region sizes and splits and helps with running HBase in production</li> |
| </ul></dd> |
| </dl> |
| </div> |
| |
| |
| </div> |
| </div> |
| <hr/> |
| <footer> |
| <div class="container"> |
| <div class="row"> |
| <p>Copyright ©2007–2020 |
| <a href="https://www.apache.org/">The Apache Software Foundation</a>. |
| All rights reserved. <li id="publishDate" class="pull-right">Last Published: 2020-09-19</li> |
| </p> |
| </div> |
| <p id="poweredBy" class="pull-right"><a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /></a> |
| </p> |
| </div> |
| </footer> |
| </body> |
| </html> |