Useful Data Scripts

Generating Company Name From Keywords

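# Create a Company for each distinct company_name and link the keyword to it.
# find_each loads records in batches instead of all at once.
Keyword.where.not(company_name: nil).find_each do |keyword|
  company = Company.find_or_create_by(name: keyword.company_name)
  keyword.update(company_id: company.id)
  puts "#{company.id} #{company.name}"
end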

# Sanity checks
Keyword.joins(:company).count
Company.pluck(:name)
Company.count

Generating Category Name From Keywords

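# Create a Category for each distinct category_name and link the keyword to it.
# find_each loads records in batches instead of all at once.
Keyword.where.not(category_name: nil).find_each do |keyword|
  category = Category.find_or_create_by(display_name: keyword.category_name)
  keyword.update(category_id: category.id)
  puts "#{category.id} #{category.display_name}"
end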

# Sanity checks
Keyword.joins(:category).count
Category.pluck(:display_name)
Category.count

Tag Text With Categories

Create a JSON mapping file of all of your tags.

If all of your tags already exist in a categories table, you can generate this file by running:

category_tags_hash = Category.pluck(:display_name).map { |k| [k.downcase, k.downcase] }.to_h
# File.write opens, writes, and closes the file in one call
File.write(Rails.root.join('db/fixtures/category_tags.json'), JSON.pretty_generate(category_tags_hash))
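
To preview what the generated mapping looks like without touching the database, the same transformation can be run on a plain array. This is a minimal sketch; `display_names` is a hypothetical stand-in for the result of `Category.pluck(:display_name)`.

```ruby
require "json"

# Hypothetical display names standing in for Category.pluck(:display_name).
display_names = ["Developer", "Designer"]

# Each lowercased name maps to itself, so every category starts out as its own keyword.
category_tags_hash = display_names.map { |name| [name.downcase, name.downcase] }.to_h

json_text = JSON.pretty_generate(category_tags_hash)
puts json_text
```

Because every key initially equals its value, extra keywords such as "programmer" have to be added to the file by hand afterwards.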

The file should look similar to the example below, where each key is a keyword to search for and each value is the category tag to apply.

{
  "software engineer": "developer",
  "software developer": "developer",
  "programmer": "developer",
  "designer": "designer",
  "web design": "designer"
}

Scan text and output tags

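First, build a regex from the mapping keys

category_tags_json = JSON.parse(File.read(Rails.root.join('db/fixtures/category_tags.json')))
# Escape each key and use word boundaries, so keywords also match at the start or end
# of the text and next to punctuation (joining with " | " required literal spaces)
regex = /\b(#{category_tags_json.keys.map { |k| Regexp.escape(k) }.join("|")})\b/i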
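
The pattern can be tried outside Rails before running it against real records. The sketch below assumes a small hypothetical mapping and sample sentence. It escapes each key and wraps the alternation in word boundaries, which assumes every key starts and ends with a word character.

```ruby
require "json"

# Hypothetical sample mapping; in the script it is read from
# db/fixtures/category_tags.json.
category_tags_json = JSON.parse(<<~MAPPING)
  {
    "software engineer": "developer",
    "programmer": "developer",
    "web design": "designer"
  }
MAPPING

# Escape each key so characters such as "." are matched literally,
# then wrap the alternation in word boundaries.
alternatives = category_tags_json.keys.map { |key| Regexp.escape(key) }
regex = /\b(#{alternatives.join("|")})\b/i

# Hypothetical text to scan.
sample_text = "Programmer wanted. Web design experience is a plus."
matches = sample_text.scan(regex).flatten
puts matches.inspect
```

With word boundaries, a keyword at the very start of the text or directly before a period still matches, while a longer word that merely contains a keyword (for example "programmers") does not.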

Next, specify the object to scan. You can replace self with an object such as User.first

input_object = self
input_text = input_object.description
# Strip HTML tags
input_text = ActionView::Base.full_sanitizer.sanitize(input_text)

Scan the text and return a list of the unique tags for the matched keys

@tag_list = input_text.scan(regex).flatten
# Look up the tag for each matched keyword; drop anything without a mapping
@tag_list = @tag_list.map { |x| category_tags_json[x.strip.downcase] }.compact
@tag_list = @tag_list.uniq
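
The scan, lookup, and de-duplication steps can be checked end to end in plain Ruby. This is a sketch under assumed data: the `category_tags` hash and `input_text` are hypothetical, and the regex is built with word boundaries rather than padded spaces.

```ruby
# Hypothetical mapping of keyword => category tag.
category_tags = {
  "software engineer" => "developer",
  "programmer"        => "developer",
  "web design"        => "designer"
}
regex = /\b(#{category_tags.keys.map { |k| Regexp.escape(k) }.join("|")})\b/i

# Hypothetical description text, already stripped of HTML.
input_text = "Senior Software Engineer (or programmer) with some web design skills"

tag_list = input_text.scan(regex).flatten
                     .map { |match| category_tags[match.strip.downcase] } # keyword -> tag
                     .compact                                             # drop unmapped matches
                     .uniq                                                # one entry per tag
puts tag_list.inspect
```

Two different keywords ("Software Engineer" and "programmer") resolve to the same tag here, which is why the final `uniq` matters.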

Map the tag list to category IDs

category_tags_ids = Category.pluck(:display_name, :id).map { |k, v| [k.downcase, v] }.to_h
# Tags with no matching category yield nil, so drop them
@tag_ids = @tag_list.map { |x| category_tags_ids[x.strip.downcase] }.compact

Save the tag IDs to the object

# Use | (set union) instead of + so IDs already on the object are not added twice
input_object.update(category_ids: input_object.category_ids | @tag_ids.compact)
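
When merging ID arrays for an update like the one above, the choice of operator matters: `+` concatenates and keeps duplicates, while `|` returns the union. A minimal sketch with hypothetical IDs:

```ruby
existing_ids = [1, 2, 3]    # hypothetical ids already on the object
tag_ids      = [3, 4, nil]  # hypothetical lookup result; nil means no matching category

concatenated = existing_ids + tag_ids          # keeps the duplicate 3 and the nil
merged       = existing_ids | tag_ids.compact  # union without duplicates, order preserved

puts concatenated.inspect
puts merged.inspect
```

Compacting before the union keeps unmatched tags from reaching the association setter as nil.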

Combining arrays