Ruby on Rails Development Using Mongoid 5.0.0 - 3. How to Use MapReduce to Get Pageview Data

What is MapReduce?

Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results.

MapReduce is a big data term that Google popularized in recent years. It is a method for processing large data sets in parallel, distributed across many machines. In my own words: “Map” assigns a matching function to many machines, cutting a huge data set into small groups of matched data, and “Reduce” then aggregates the computed results.

In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further condense or process the results of the aggregation.
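
To make the map and reduce phases concrete, here is a minimal sketch in plain Ruby rather than in MongoDB itself. The sample documents, the field names (path, views), and the method names are my own assumptions for illustration; MongoDB runs the real phases on the server with JavaScript functions.

```ruby
# Hypothetical pageview documents; the field names are assumptions.
PAGEVIEWS = [
  { path: "/home",  views: 3 },
  { path: "/about", views: 1 },
  { path: "/home",  views: 2 },
  { path: "/blog",  views: 4 },
  { path: "/about", views: 5 }
].freeze

# Map phase: emit one [key, value] pair per input document.
def map_phase(documents)
  documents.map { |doc| [doc[:path], doc[:views]] }
end

# Grouping step: collect all values emitted under the same key.
def group_by_key(pairs)
  pairs.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |(key, value), groups|
    groups[key] << value
  end
end

# Reduce phase: condense the values of each key into a single result.
# (MongoDB only calls reduce for keys with multiple values; summing a
# single value gives the same answer, so this sketch does not special-case it.)
def reduce_phase(groups)
  groups.transform_values(&:sum)
end

def pageview_totals(documents)
  reduce_phase(group_by_key(map_phase(documents)))
end

puts pageview_totals(PAGEVIEWS).inspect
```

Each path ends up with one total, which is the same shape of result that a map-reduce over a pageview collection stores in its output collection.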

MongoDB MapReduce

If you want more details, please check the official documentation.

Let’s see how to use MapReduce in Mongoid.

Read on →

Ruby on Rails Development Using Mongoid 5.0.0 - 2. How to Use Aggregation to Get Pageview Data

What is Aggregation?

Aggregations are operations that process data records and return computed results.

In my opinion, an aggregation is just like a query, except that it processes the data step by step, with each step passing its output to the next, like a pipeline.
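
The pipeline idea can be sketched in plain Ruby, without MongoDB. This is only an illustration: the sample records, the field names, and the stage names below are assumptions of mine, loosely mirroring the match, group, and sort stages of an aggregation.

```ruby
# Hypothetical pageview records; the field names are assumptions.
RECORDS = [
  { path: "/home",  views: 3 },
  { path: "/about", views: 1 },
  { path: "/home",  views: 2 },
  { path: "/blog",  views: 4 }
].freeze

# Each stage takes a list of records and returns a new list.

# Keep only records with at least two views (like a match stage).
MATCH_STAGE = ->(records) { records.select { |r| r[:views] >= 2 } }

# Sum the views per path (like a group stage).
GROUP_STAGE = lambda do |records|
  records.group_by { |r| r[:path] }
         .map { |path, rows| { path: path, total: rows.sum { |r| r[:views] } } }
end

# Order by total, largest first (like a sort stage).
SORT_STAGE = ->(records) { records.sort_by { |r| -r[:total] } }

PIPELINE = [MATCH_STAGE, GROUP_STAGE, SORT_STAGE].freeze

# Feed the output of each stage into the next one, in order.
def run_pipeline(records, stages)
  stages.reduce(records) { |current, stage| stage.call(current) }
end

run_pipeline(RECORDS, PIPELINE).each do |row|
  puts "#{row[:path]}: #{row[:total]}"
end
```

Because every stage has the same shape (records in, records out), stages can be added, removed, or reordered without touching the others.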

MongoDB Aggregation

If you want more details, please check the official documentation.

Read on →

[HowTo] Integrate Google Analytics and Google AdWords for Customize Ads Retargeting


Ruby on Rails Development Using Mongoid 5.0.0 - 1. Setup MongoDB

MongoDB

This tutorial series will help you start your Rails project with MongoDB.

I use Mongoid 5.0.0 as the example.

In this tutorial, you will see how to:

  1. Install MongoDB on Mac OS X

  2. Create Some Database Users in MongoDB

  3. Setup Rails Projects

(Tips)

  1. Dump Data

  2. Restore Data

Let’s go!

Read on →

[Tips] Limit a Class to Be Used Only in the Development Environment

In my project, I have to dump data from MySQL to MongoDB, and this feature is only used in development. However, Rails automatically loads every class under the “app/” folder. So how can I keep a development-only class from being loaded in the production environment?

Let’s check it out.

Suppose you have a class named DumpDataFromOldServer, located in the folder ‘/app/development_only’, like this:

# app/development_only/dump_data_from_old_server.rb

class DumpDataFromOldServer < ActiveRecord::Base
  # ...
end

Step 1. application.rb

You still need to tell Rails which folders to load by using config.eager_load_paths.

P.S. What is the difference between eager_load and autoload?

Please check http://stackoverflow.com/questions/19773266/confusing-about-autoload-paths-vs-eager-load-paths-in-rails-4

config.eager_load_paths += %W(#{Rails.root}/app/development_only)

It will be:

# config/application.rb

require File.expand_path('../boot', __FILE__)

Bundler.require(*Rails.groups)

module MyAwesomeProject
  class Application < Rails::Application
    ...

    config.eager_load_paths += %W(#{Rails.root}/app/development_only)

    ...
  end
end

Step 2. Update your environment file

Ref: http://stackoverflow.com/questions/13756986/how-to-blacklist-directory-loading-in-rails

Add these lines:

  path_rejector = lambda { |s| s.include?("app/development_only") }

  # Remove the path from being loaded when Rails starts:
  config.eager_load_paths = config.eager_load_paths.reject(&path_rejector)

  # Remove the path from being lazily loaded:
  ActiveSupport::Dependencies.autoload_paths.reject!(&path_rejector)

It will be:

# config/environments/production.rb
Rails.application.configure do

  ...

  path_rejector = lambda { |s| s.include?("app/development_only") }

  config.eager_load_paths = config.eager_load_paths.reject(&path_rejector)

  ActiveSupport::Dependencies.autoload_paths.reject!(&path_rejector)
end
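
To see what the rejector does in isolation, here is a small sketch in plain Ruby with no Rails loaded. The paths under /srv/my_app and the helper name production_paths are hypothetical, made up for this illustration; in a real app the list comes from config.eager_load_paths.

```ruby
# Hypothetical load paths; a real app builds these from Rails.root.
EAGER_LOAD_PATHS = [
  "/srv/my_app/app/models",
  "/srv/my_app/app/controllers",
  "/srv/my_app/app/development_only"
].freeze

# Same predicate as in production.rb: true for any path that
# points into the development-only folder.
PATH_REJECTOR = lambda { |path| path.include?("app/development_only") }

# Return a new list without the rejected paths. Array#reject does not
# modify the receiver, so the original list stays untouched.
def production_paths(paths, rejector)
  paths.reject(&rejector)
end

puts production_paths(EAGER_LOAD_PATHS, PATH_REJECTOR)
```

Note that the predicate matches on a substring, so any other folder whose path contains “app/development_only” would be dropped as well.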

Step 3. Test

# In Development console: rails c

> DumpDataFromOldServer.all.first

DumpDataFromOldServer Load (0.6ms)  SELECT  `dump_data_from_old_server`.* FROM `dump_data_from_old_server`  LIMIT 1
=> #<DumpDataFromOldServer id: 1, created_at: "2014-05-09 16:59:55">


# In Production console: rails c -e production
> DumpDataFromOldServer.all.first
NameError: uninitialized constant DumpDataFromOldServer

Build Up a Homemade Server-to-server Interactions With Google API Using Ruby - Take Google Analytics as an Example

‘google-api-ruby-client’ changed significantly from 0.8 to 0.9 just as my project was running out of time, and I also ran into this issue. So I made a decision: let’s build an API client myself.

This implementation follows the Google guideline.

Read on →