OQMD2PyChemiaDB

From WVU Materials Discovery Group

Recreating the OQMD database and PyChemiaDB

Install MariaDB

These instructions are for systems running RHEL/CentOS. Run the commands below as root, or prefix each one with sudo. We also install mariadb-devel because it is needed to build one of the dependencies of qmpy.

sudo yum install mariadb-server mariadb mariadb-test mariadb-devel

Activate MariaDB service

To make the service available immediately, as well as after any future restart of the machine:

systemctl start mariadb.service
systemctl enable mariadb.service

Check the service is actually running

systemctl status mariadb.service

The answer should be something like:

● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-08-14 20:46:09 EDT; 31s ago
 Main PID: 8439 (mysqld_safe)
   CGroup: /system.slice/mariadb.service
           ├─8439 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─8596 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mariadb/mariadb.log --pid-file=/var/run/mariadb/mariadb.pid --socket=/va...

Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: The latest information about MariaDB is available at http://mariadb.org/.
Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: You can find additional information about the MySQL part at:
Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: http://dev.mysql.com
Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: Support MariaDB development by buying support/new features from MariaDB
Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: Corporation Ab. You can contact us about this at sales@mariadb.com.
Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: Alternatively consider joining our community based development effort:
Aug 14 20:46:07 mdg16.wvu.edu mariadb-prepare-db-dir[8357]: http://mariadb.com/kb/en/contributing-to-the-mariadb-project/
Aug 14 20:46:07 mdg16.wvu.edu mysqld_safe[8439]: 170814 20:46:07 mysqld_safe Logging to '/var/log/mariadb/mariadb.log'.
Aug 14 20:46:07 mdg16.wvu.edu mysqld_safe[8439]: 170814 20:46:07 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Aug 14 20:46:09 mdg16.wvu.edu systemd[1]: Started MariaDB database server.
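For scripted setups, the same check can be made from Python. The following is a minimal sketch, assuming a systemd machine where `systemctl is-active --quiet` reports the state through its exit status; the helper name `service_is_active` is made up for this example.

```python
import subprocess


def service_is_active(name, run=subprocess.call):
    """Return True when `systemctl is-active --quiet NAME` exits with status 0."""
    try:
        # --quiet suppresses the textual state; only the exit status is used.
        return run(["systemctl", "is-active", "--quiet", name]) == 0
    except OSError:
        # systemctl could not be executed (for example, not a systemd machine).
        return False


if __name__ == "__main__":
    state = "running" if service_is_active("mariadb.service") else "not running"
    print("mariadb.service is " + state)
```

The `run` argument exists only so the function can be exercised without a real service; in normal use it is left at its default.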

You can also confirm that you can get initial access from the command line

$ mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.52-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

MariaDB [(none)]> quit
Bye

Secure the installation

There is a basic securing script that sets a password for the MariaDB root account and removes the test database entirely.

Just press Enter when asked for the current root password inside MariaDB. That root account has nothing to do with the root account of the system; the passwords do not need to match.

$ mysql_secure_installation

NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MariaDB
      SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!

In order to log into MariaDB to secure it, we'll need the current
password for the root user.  If you've just installed MariaDB, and
you haven't set the root password yet, the password will be blank,
so you should just press enter here.

Enter current password for root (enter for none): 
OK, successfully used password, moving on...

Setting the root password ensures that nobody can log into the MariaDB
root user without the proper authorisation.

Set root password? [Y/n] 
New password: 
Re-enter new password: 
Password updated successfully!
Reloading privilege tables..
 ... Success!


By default, a MariaDB installation has an anonymous user, allowing anyone
to log into MariaDB without having to have a user account created for
them.  This is intended only for testing, and to make the installation
go a bit smoother.  You should remove them before moving into a
production environment.

Remove anonymous users? [Y/n] 
 ... Success!

Normally, root should only be allowed to connect from 'localhost'.  This
ensures that someone cannot guess at the root password from the network.

Disallow root login remotely? [Y/n] 
 ... Success!

By default, MariaDB comes with a database named 'test' that anyone can
access.  This is also intended only for testing, and should be removed
before moving into a production environment.

Remove test database and access to it? [Y/n] 
 - Dropping test database...
 ... Success!
 - Removing privileges on test database...
 ... Success!

Reloading the privilege tables will ensure that all changes made so far
will take effect immediately.

Reload privilege tables now? [Y/n] 
 ... Success!

Cleaning up...

All done!  If you've completed all of the above steps, your MariaDB
installation should now be secure.

Thanks for using MariaDB!

Creating a non-root user with privileges on the database

Assuming that there is a user called 'mdg', we can give that user privileges to read and write all databases running on the system. You can also restrict the grant to a single database and limit it to read-only access.

$ mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 5.5.52-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> CREATE USER 'mdg'@'localhost';
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON * . * TO 'mdg'@'localhost'; FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)

MariaDB [(none)]> quit
Bye
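The narrower grants mentioned above (a single database, read-only access) differ only in the privilege list and the target of the GRANT statement. The sketch below builds both forms as SQL strings; the function `grant_statement` is a hypothetical helper, and the database name oqmd is used purely as an illustration. It only produces text to paste at the MariaDB prompt and does not connect to the server.

```python
import re

# Only plain names are accepted; hosts such as '%' or dotted names would need
# a wider pattern (simplification made for this example).
_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def grant_statement(user, database=None, read_only=False, host="localhost"):
    """Build a GRANT statement for 'user'@'host'.

    database=None targets every database (*.*);
    read_only=True grants SELECT instead of ALL PRIVILEGES.
    """
    for value in (user, host) + ((database,) if database else ()):
        if not _NAME.match(value):
            raise ValueError("unexpected characters in {!r}".format(value))
    privileges = "SELECT" if read_only else "ALL PRIVILEGES"
    target = "{}.*".format(database) if database else "*.*"
    return "GRANT {} ON {} TO '{}'@'{}';".format(privileges, target, user, host)


print(grant_statement("mdg"))
print(grant_statement("mdg", database="oqmd", read_only=True))
```

The first line printed is equivalent to the statement used in the session above; the second restricts 'mdg' to reading a single database.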

Creating a new database

Let's create a new database called oqmd, which we will use to recreate the data from the dump

$ mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 16
Server version: 5.5.52-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> create database oqmd;
Query OK, 1 row affected (0.00 sec)

MariaDB [(none)]> use oqmd;
Database changed
MariaDB [oqmd]> quit
Bye

Recreating the OQMD database from the dump

The webpage for the OQMD database is http://oqmd.org. The most recent database can be downloaded from

http://oqmd.org/download/

At the time this tutorial was written, the command to download the most recent database was

wget http://oqmd.org/static/downloads/qmdb__v1_1__102016.sql.gz

Once you have downloaded the compressed dump, uncompress it with

gunzip qmdb__v1_1__102016.sql.gz

Once the file is uncompressed, load it into the new database with:

mysql oqmd -u root -p < qmdb__v1_1__102016.sql
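The decompression step can also be done from Python, which is convenient when the whole procedure is scripted. This is a minimal sketch using only the standard library; `gunzip_file` is a name chosen for this example. Unlike the gunzip command, it leaves the original .gz file in place, so enough disk space for both files is needed.

```python
import gzip
import shutil


def gunzip_file(src, dest=None):
    """Decompress SRC (a .gz file) into DEST and return the DEST path.

    When DEST is omitted, the trailing ".gz" is dropped from SRC.
    """
    if dest is None:
        if not src.endswith(".gz"):
            raise ValueError("expected a .gz file: " + src)
        dest = src[:-3]
    # copyfileobj streams in chunks, so a large dump is never held in memory.
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as plain:
        shutil.copyfileobj(compressed, plain)
    return dest
```

For the dump used here the call would be gunzip_file("qmdb__v1_1__102016.sql.gz"), which writes qmdb__v1_1__102016.sql next to it.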


Once the database is created you can test the number of entries

$ mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 21
Server version: 5.5.52-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use oqmd;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [oqmd]> SELECT COUNT(*) FROM entries;
+----------+
| COUNT(*) |
+----------+
|   471857 |
+----------+
1 row in set (0.07 sec)

MariaDB [oqmd]> quit
Bye
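The same row count can be obtained from Python through any DB-API 2.0 connection. The sketch below is written against that generic interface rather than a specific driver; `count_rows` is a hypothetical helper, and the commented usage assumes the MySQLdb module (package mysql-python) and the 'mdg' account created earlier without a password.

```python
import re


def count_rows(connection, table):
    """Return the number of rows in TABLE using a DB-API 2.0 connection."""
    # Table names cannot be passed as query parameters, so the name is
    # validated before it is formatted into the statement.
    if not re.match(r"^[A-Za-z0-9_]+$", table):
        raise ValueError("unexpected table name: {!r}".format(table))
    cursor = connection.cursor()
    try:
        cursor.execute("SELECT COUNT(*) FROM {}".format(table))
        return cursor.fetchone()[0]
    finally:
        cursor.close()


# Hypothetical usage once mysql-python is installed:
#   import MySQLdb
#   conn = MySQLdb.connect(host="localhost", user="mdg", db="oqmd")
#   print(count_rows(conn, "entries"))
```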

The OQMD database is ready and we can continue creating the PyChemiaDB database.


Installing MongoDB

The best way to keep an up-to-date version of MongoDB is to use the repository provided by the developers. As root, create the file

 emacs /etc/yum.repos.d/mongodb-org-3.4.repo

And write the following inside:

[mongodb-org-3.4]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.4/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-3.4.asc
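If the repository file has to be created without an interactive editor (for instance from a provisioning script), its content can be generated with Python's configparser. This is only a sketch: `mongodb_repo_text` is a made-up helper, and the assumption that the same URL pattern holds for release series other than 3.4 has not been verified here.

```python
import configparser
import io


def mongodb_repo_text(series="3.4"):
    """Render the yum repository definition for a MongoDB release series."""
    section = "mongodb-org-" + series
    # interpolation=None keeps "$releasever" literal; yum expands it, not Python.
    parser = configparser.ConfigParser(interpolation=None)
    parser.add_section(section)
    parser.set(section, "name", "MongoDB Repository")
    parser.set(section, "baseurl",
               "https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/"
               + series + "/x86_64/")
    parser.set(section, "gpgcheck", "1")
    parser.set(section, "enabled", "1")
    parser.set(section, "gpgkey",
               "https://www.mongodb.org/static/pgp/server-" + series + ".asc")
    buffer = io.StringIO()
    # Write key=value without spaces, matching the hand-written file.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


print(mongodb_repo_text(), end="")
```

Redirecting the output into /etc/yum.repos.d/mongodb-org-3.4.repo (as root) gives the same file as typing it by hand.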

After that simply install mongo from the repository:

sudo yum install -y mongodb-org

Activate the service and enable the automatic start for the next reboot of the machine

$ systemctl enable mongod
$ systemctl start mongod

Test that the service is actually running

$ systemctl status mongod
● mongod.service - High-performance, schema-free document-oriented database
   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-08-15 00:43:04 EDT; 5s ago
     Docs: https://docs.mongodb.org/manual
  Process: 11328 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 11325 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 11323 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)
 Main PID: 11334 (mongod)
   CGroup: /system.slice/mongod.service
           └─11334 /usr/bin/mongod -f /etc/mongod.conf

Aug 15 00:43:04 mdg16.wvu.edu systemd[1]: Starting High-performance, schema-free document-oriented database...
Aug 15 00:43:04 mdg16.wvu.edu systemd[1]: Started High-performance, schema-free document-oriented database.
Aug 15 00:43:04 mdg16.wvu.edu mongod[11331]: about to fork child process, waiting until server is ready for connections.
Aug 15 00:43:04 mdg16.wvu.edu mongod[11331]: forked process: 11334
Aug 15 00:43:05 mdg16.wvu.edu mongod[11331]: child process started successfully, parent exiting

Use the command-line interface to enter the mongo shell and list the existing databases

$ mongo
MongoDB shell version v3.4.7
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.7
Server has startup warnings: 
2017-08-15T00:43:04.958-0400 I CONTROL  [initandlisten] 
2017-08-15T00:43:04.958-0400 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] 
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] 
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] ** WARNING: You are running on a NUMA machine.
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] **          We suggest launching mongod like this to avoid performance problems:
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] **              numactl --interleave=all mongod [other options]
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] 
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] 
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-08-15T00:43:04.959-0400 I CONTROL  [initandlisten] 
> show dbs
admin  0.000GB
local  0.000GB
> exit
bye
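The database listing can also be obtained from Python with pymongo. The sketch below assumes pymongo 3.6 or newer, where MongoClient provides list_database_names(); the helper names `database_names` and `connect_local` are invented for this example, and pymongo is imported lazily so the file can be loaded on a machine where it is not installed yet.

```python
def database_names(client):
    """Return the sorted database names visible through CLIENT.

    CLIENT is expected to be a pymongo.MongoClient (or any object that
    offers a compatible list_database_names() method).
    """
    return sorted(client.list_database_names())


def connect_local(port=27017):
    """Open a client for the mongod running on this machine."""
    from pymongo import MongoClient  # imported here so pymongo stays optional
    return MongoClient("localhost", port)


# Hypothetical usage against the server started above:
#   print(database_names(connect_local()))
```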

For the time being we can ignore the performance warnings above; we will concentrate on creating the new PyChemiaDB database.

Installing qmpy

The Python package qmpy serves as an interface between the OQMD database and Python. PyChemia uses qmpy to search for the best candidate for each entry in the database. We will download qmpy directly from GitHub. If the git command is not present, install it with (as root or with sudo)

yum install git

Now, download qmpy from the official repository

git clone https://github.com/wolverton-research-group/qmpy.git

qmpy has a number of prerequisites, many of which are also needed by PyChemia. Install all the prerequisites with pip.

If you do not have pip installed, install it with yum (CentOS/RHEL). We also need the development packages to compile some of the packages that are not pure Python code. For RHEL 7.4 use the command:

sudo yum install python2-pip python34-pip python-devel python34-devel

Most scientific software relies on quite recent versions of many packages, so it is generally not a good idea to rely on the packages provided by the official repositories of the Linux distribution. Using pip you can install packages that are far more recent. That is the path we will follow to satisfy all the dependencies. The next commands assume that you are now using a personal account.

The first step is to use pip to upgrade pip itself to the latest version

pip install --upgrade pip --user

The option --user installs the package in your home folder, usually under ~/.local. You can add ~/.local/bin to your PATH in order to give preference to the programs that you install there. One of the dependencies, spglib, requires a more recent version of setuptools, so we need to update that package too

~/.local/bin/pip install --upgrade setuptools --user
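The exact location used by --user installs depends on the machine (it can be overridden with the PYTHONUSERBASE environment variable), so it is worth asking Python directly. The sketch below uses the standard site module; the helper names are made up, and the "bin" subdirectory is the POSIX convention assumed for the RHEL/CentOS systems covered here.

```python
import os
import site


def user_bin_dir():
    """Return the directory where `pip install --user` places executables."""
    # site.getuserbase() honours the PYTHONUSERBASE environment variable.
    return os.path.join(site.getuserbase(), "bin")


def on_path(directory):
    """Return True when DIRECTORY is listed in the PATH environment variable."""
    entries = os.environ.get("PATH", "").split(os.pathsep)
    wanted = os.path.normpath(directory)
    return wanted in [os.path.normpath(entry) for entry in entries if entry]


if __name__ == "__main__":
    status = "on PATH" if on_path(user_bin_dir()) else "not on PATH"
    print(user_bin_dir() + " (" + status + ")")
```

The same base directory is printed by running python -m site --user-base in a terminal.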

Now we can proceed to install the most recent versions of the Python packages needed by qmpy and PyChemia.

~/.local/bin/pip install --upgrade django pulp numpy scipy matplotlib networkx pytest python-memcached \
ase django-extensions lxml pyparsing spglib pycifrw pyyaml scikit-learn pymongo future nose --user
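After a long pip run it helps to confirm that every package can actually be found by the interpreter. The following sketch checks each one with importlib without importing it; `missing_modules` is a hypothetical helper, and the pip-name to import-name mapping is an assumption based on how these packages are commonly imported.

```python
import importlib.util

# Packages whose import name differs from the pip name (assumed mapping).
IMPORT_NAMES = {
    "pyyaml": "yaml",
    "scikit-learn": "sklearn",
    "python-memcached": "memcache",
    "django-extensions": "django_extensions",
    "pycifrw": "CifFile",
}

REQUIRED = [
    "django", "pulp", "numpy", "scipy", "matplotlib", "networkx", "pytest",
    "python-memcached", "ase", "django-extensions", "lxml", "pyparsing",
    "spglib", "pycifrw", "pyyaml", "scikit-learn", "pymongo", "future", "nose",
]


def missing_modules(packages):
    """Return the packages whose top-level module cannot be located."""
    missing = []
    for package in packages:
        module = IMPORT_NAMES.get(package, package)
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    print("missing:", ", ".join(missing_modules(REQUIRED)) or "none")
```

Run it with the same interpreter that pip installed into, since each Python version keeps its own set of packages.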

The final remaining dependency is mysql-python. This Python package needs mariadb-devel; if you did not install it before, this is the moment to do it

yum install mariadb-devel

You can test that you have a command called mysql_config available on your system.

$ mysql_config --version
5.5.52
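A script that prepares the build environment can perform the equivalent check before attempting to compile mysql-python. This is a small sketch using shutil.which from the standard library (Python 3.3 or newer); `missing_tools` is a name chosen for this example.

```python
import shutil


def missing_tools(tools):
    """Return the subset of TOOLS that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    absent = missing_tools(["mysql_config", "git"])
    print("missing: " + ", ".join(absent) if absent else "all tools found")
```

If mysql_config is reported as missing, the mariadb-devel package is most likely not installed.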