1. word2vec training procedure explained

    Abstract

    This article gives a detailed explanation of the weight vector updates in the word2vec C implementation.

    It also identifies some incorrect steps in the code that trains the skip-gram model.

    Introduction

    Word2vec is a well-known word embedding approach. See Word2vec for details.

    Prerequisite

    Some properties of the sigmoid function

    (0)

    $$ \sigma' (y) = \sigma (y) \cdot [ 1 - \sigma (y) ] \cdot y' $$
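
    Property (0) can be checked numerically. The following is a minimal C++ sketch, not code from the word2vec source: the function names are hypothetical, and the inner function is assumed to be y = w * x, so that y' with respect to w is x.

```cpp
#include <cmath>

// Logistic sigmoid: sigma(y) = 1 / (1 + exp(-y)).
double sigmoid(double y) {
    return 1.0 / (1.0 + std::exp(-y));
}

// Analytic derivative with respect to y: sigma(y) * (1 - sigma(y)).
double sigmoid_grad(double y) {
    const double s = sigmoid(y);
    return s * (1.0 - s);
}

// Central finite-difference approximation of d sigma / d y.
double sigmoid_grad_numeric(double y, double h = 1e-5) {
    return (sigmoid(y + h) - sigmoid(y - h)) / (2.0 * h);
}

// Chain rule for the assumed inner function y = w * x:
// d sigma(w * x) / d w = sigma(y) * (1 - sigma(y)) * x, where x plays the role of y'.
double sigmoid_linear_grad_w(double w, double x) {
    return sigmoid_grad(w * x) * x;
}
```

    Comparing sigmoid_grad(y) with sigmoid_grad_numeric(y) at a few points should show agreement up to the finite-difference error.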



  2. test render math

    Test rendering LaTeX math using the render_math plugin.

    Inline LaTeX: \( \log'(x) = \frac{1}{x} \)

    Block (div) LaTeX:

    $$ \sigma'(x) = \sigma(x) \cdot [ 1 - \sigma(x) ] $$



  3. Recognizing text in video with tesseract-ocr and opencv

    Abstract

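    This article uses tesseract-ocr and opencv to extract the Id numbers shown in Bilibili's 2015 dance ranking video, then downloads the extracted videos with the you-get tool. For the download address of the video pack, see Result.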

    Introduction

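    On Bilibili I saw that someone had published a TOP100 ranking video for the dance section in 2015 (link). After watching it I thought the videos were all pretty good, and some I had not seen before. Since they made it onto this ranking, they should be decent and worth downloading to keep.

    So how can the videos in it be downloaded?

    My plan was: 1. Every dance video mentioned in the ranking video is labeled with its Id number, which is the six digits following av. 2. Once the Id numbers are obtained, download the corresponding videos with the you-get (official site) tool.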
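
    Once text has been recognized from a frame, step 1 comes down to finding "av" followed by six digits. The following is a minimal C++ sketch of that matching step only, under the assumption that the OCR output is already available as a string; extract_av_ids is a hypothetical helper and is not the code used in the original post.

```cpp
#include <regex>
#include <set>
#include <string>

// Collect the distinct ids of the form "av" + exactly six digits found in text.
// Hypothetical helper for illustration; the input is assumed to be OCR output.
std::set<std::string> extract_av_ids(const std::string& text) {
    // Capture six digits after "av"; the lookahead rejects longer digit runs.
    static const std::regex pattern(R"(av(\d{6})(?!\d))");
    std::set<std::string> ids;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        ids.insert((*it)[1].str());
    }
    return ids;
}
```

    A std::set is used here because the same Id is likely to be recognized in many consecutive frames, so duplicates are dropped automatically.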

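    So how can the video Ids be obtained?

    The simple way is to go through the ranking video and write them down by hand, one at a time. This should reach 100% accuracy, but it is hard on the eyes.

    Programmers really need to take good care of their eyes.

    Having learned coding for four or five years, I wondered whether a program could extract all the Id numbers from the ranking video.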

    Implementation

    With the idea in place, a method is needed too.

    I recalled an article I had read earlier on the cloudera blog ...



  4. Spring Rest server and Apache Avro objects

    1. Error handler class

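    import org.springframework.http.HttpStatus;
    import org.springframework.web.bind.annotation.ControllerAdvice;
    import org.springframework.web.bind.annotation.ExceptionHandler;
    import org.springframework.web.bind.annotation.ResponseBody;
    import org.springframework.web.bind.annotation.ResponseStatus;

    // Maps any uncaught exception to a 400 response whose body is the exception message.
    @ControllerAdvice
    public class ErrorHandler {
        @ExceptionHandler(value = Exception.class)
        @ResponseStatus(HttpStatus.BAD_REQUEST)
        @ResponseBody
        public String errorResponse(Exception exception) {
            return exception.getMessage();
        }
    }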
    

    With this handler in place, the client receives a clear error message.

    Reference

    http://www.importnew.com/7903.html

    2. Change the response's Content-Type

    @RequestMapping(method = RequestMethod.GET, produces = MediaType.APPLICATION_JSON_VALUE ...



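  5. Hadoop Secure Mode configuration

    Introduction

    My boss asked me to set up Hadoop Secure Mode. I found some material online, but the steps differed slightly from one source to another, so the detailed configuration process is recorded below.

    Environment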

    Hadoop Version: Apache Hadoop 2.7.1

    /etc/hosts:

    192.168.1.100   hadoop1
    192.168.1.101   hadoop2
    192.168.1.102   hadoop3
    

    hadoop1 is the master and the other two machines are slaves. Note: make sure that forward and reverse DNS resolution work correctly across the cluster.

    /etc/profile:

    #Hadoop
    HADOOP_VERSION=2.7.1
    export HADOOP_PREFIX=/opt/hadoop-${HADOOP_VERSION}
    export HADOOP_HOME=${HADOOP_PREFIX}
    export ...



  6. C++ link and library usage

    To use a programming language productively, we need third-party libraries.

    If we install libraries and header files in /usr/lib, /usr/local/lib and /usr/include, /usr/local/include, compilers will find them automatically. But when libraries and headers are not in system paths, how do you tell ...



  7. backslash in C++

    A backslash can be used to break very long lines. Oops, I have never used it in C++.

    How do you use a backslash?

    This is an example:

    #include <iostream>
    #include <string>
    int main() {
        int a = 1, \
                b = 2;
        int c = 3,
            d = 4;
        std::string s = "fffff" \
                         "fffff\\n"; // right
        std::string s2 = "fffff"
            "fffff ...


