Categories
Admin

Fighting forum spam

As one of the managers of Ignite Taipei, I’m trying to come up with new ways to let the community communicate, new ways to share information, advice and all. A while ago I set up a forum at http://bbs.ignitetaipei.tw/ and I thought that it would be an interesting experiment. Well, so far it is useless for communication, but it turned out to be a very interesting experience from the sysadmin point of view.

I used FluxBB, because it looked simple enough, seemed to be quite fast (for low traffic volume at least), and well configurable. Except that within a very short time I ran into a spam problem: many fake users registered, and posted lots of algorithmically generated garbage text with a bit of advertisement here and there.

First I looked into FluxBB’s own solutions, and it looks like it might not have been a great choice, because many of the spam-fighting plugins are out of date, or not supported anymore, or just a real pain to set up. The immediate practical step I could take was updating my security questions, rolling my own version of “written with words, how much is 5 + 4?”, the regular low-tech captcha on FluxBB. Looks like the original answers are already in the database everywhere, so I had to write my own set, which seemed to work for a while, cutting down on red-flagged registrations. But it’s not ideal, since I want to make this a dual-language forum (Ignite Taipei has both English & Chinese as official languages).

Instead I turned on email confirmation. When someone registers, the password is sent to their email and they have to use that to sign in. It was okay for a tiny bit, then a crazy registration boom happened. I think I might be the only real member of the board (I said that it is a failure so far for communication :) and there are 500 other spam members. Looking at their email addresses, it seems all of them have Hotmail. That kinda suggests a giant failure at Hotmail to restrict automatic registration, which is probably a problem overall. I cannot just throw out Hotmail addresses either, because it’s a popular mail provider here in Taiwan too (my first email was Hotmail too, but that was a looooong time ago, before it was Microsoft property).
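A hunch like "they are all Hotmail" is easy to confirm by tallying the registered addresses by domain. Here is a minimal sketch, assuming the member emails have been exported to a plain text file with one address per line. The file name members.txt and the function name are made up for illustration, and how to get such an export out of FluxBB is not covered here.

```bash
#!/bin/bash
# Tally email addresses by domain, most common first.
# Reads one address per line on standard input.
count_domains() {
    awk -F@ 'NF == 2 { count[tolower($2)]++ }
             END { for (d in count) print count[d], d }' | sort -rn
}

# Usage (members.txt is a hypothetical export, one address per line):
#   count_domains < members.txt | head
```

Lines that do not look like a single user@domain pair are skipped, and domains are lower-cased so that Hotmail.com and hotmail.com are counted together.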

So captcha doesn’t work, email doesn’t work. What to do instead? At the time I was playing around with Cloudflare, to act as an easy to use CDN. I tried it before for our Ignite Taipei blog, which is hosted on Tumblr, and that doesn’t play well with Cloudflare unfortunately. Couldn’t use it for this blog before because of my DNS provider, but now I switched, so I started playing with it again.

The dashboard of the Cloudflare interface
Cloudflare stats snapshot (parts of it)

Instead of enabling Cloudflare for the entire ignitetaipei.tw domain, I just turned it on for the forum, since it’s hosted elsewhere. And that totally did it. Spam stopped that very moment, and hasn’t returned since. I think what happens is that Cloudflare knows globally a lot of web/forum/email spam hosts, and can challenge them or generally ignore them. I can even see where those spammers are coming from.

List of captured threats on the Cloudflare threats console
Cloudflare Threats Console

One weird (but actually not that surprising) thing is that the most active web crawler on the site (Cloudflare gives that info as well) was Baidu by far, so I guess more people knew about the site in China than elsewhere. Why’s that? Some forums that share vulnerable sites, or something like that? I barely had any Chinese content at that time, so it cannot be that. And since I turned on the threat control part, Baidu seems to have dropped quite a bit (I submitted the site to Google so now that’s the busiest crawler).

All in all, Cloudflare is an interesting experiment. I can really mess up my DNS with it, and could block my own site for several hours, but in general it is worth it. Just have to be careful. For example when testing, use their own name servers to check the information, and maybe instead of the “automatic” time-to-live, set some very short time first. I usually use Google’s 8.8.8.8, and they pick up the first wrong setting really quickly, then it takes hours to pick up the correction I made just minutes after the first one.
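The check described above can be sketched with dig: ask the authoritative name server and a public resolver the same question and compare the answers. This is only a sketch under assumptions. The function names are made up, and NS_AUTH in the usage comment is a placeholder for whichever name server the Cloudflare dashboard lists for the domain.

```bash
#!/bin/bash
# Ask one specific DNS server for the A records of a name.
# $1 = DNS server to query, $2 = host name
query_a() {
    dig @"$1" "$2" A +short | sort
}

# Succeeds when two answers (as printed by query_a) are non-empty and identical.
same_answer() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (NS_AUTH is a placeholder, fill in your own assigned name server):
#   NS_AUTH=your-assigned-name.ns.cloudflare.com
#   auth="$(query_a "$NS_AUTH" bbs.ignitetaipei.tw)"
#   pub="$(query_a 8.8.8.8 bbs.ignitetaipei.tw)"
#   if same_answer "$auth" "$pub"; then echo "in sync"; else echo "resolver still has old data"; fi
```

If the two differ, the public resolver is most likely still serving a cached copy, which is where a short time-to-live helps.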

After a bit of playing around, at least I have no spam anymore (keep fingers crossed). Now just have to get people to use the forums. :)

Categories
Computers

Editing talk videos for Ignite Taipei

As an organizer for Ignite Taipei, there are plenty of technical things that I need to take care of. Managing people’s slides (plenty of fun there), running the presentations (it needs the Ignite type 15s autoadvance), setting up the web stream (where everything can go wrong), recording the talks themselves. One thing I haven’t done yet is editing the videos.

For all previous events we got some more experienced people to help out. The results were okay, nothing surprisingly great but plenty usable. It still did take a lot of time to do that, and I don’t like to ask people favours. Instead this time, for Ignite Taipei #7, after almost a month of procrastination I decided to do it myself. Ignite is an experimental ground for me, where I try a lot of things I think about startups (even if it is a non-profit organization, it can have a startup-like “organization”), and I thought if I was a CEO I would make it such that “the buck stops” really here with me. I have to experience the work myself to be able to find better people to do it later. And I like video editing (from the good old DVD ripping days, which came in really handy this time as well).

Recording

The recording was done with my Canon EOS 550D. When I got the camera, I could have gotten a 600D, but I thought “I won’t be using this for videos!” Duh! Now I have to put up with freaking large files, because the 550D cannot compress them on the fly. Also, I need to check whether I can get better SD cards, because at 1080p it broke the recording last time (that’s why the Ignite #6 videos are still not edited). 720p (at 50fps) seemed to work somehow, and looks good too. Will try to fix that still in the future. Now I have a GoPro Hero3 that can do 1080p without even breaking a sweat, and I plan to use that one for the next Ignite.

Title screen

A video starts with a good title screen. Previous editors could manage animations and all the works, which I didn’t have any idea how to do this time, so I just kept it simple: a static page. But I still had to add my own spice: not just a simple title and header, but something good-looking, and expressing the spirit of our organization. So I got a cool HDR photo of Taipei City (well, actually Zhonghe, the suburbs to the south) from my friend, Chia-Da Hsu, tuned it a bit to be a background image, and plugged it into Inkscape.

Editing window of Inkscape
Creating the video titles in Inkscape

I like Inkscape a lot, open source, capable, fun.

Put in the background, position the text. The font used by Ignite, and thus for most of our own text, is Insignia LT. To make it more readable (20% gray setting), make a black outline for the text: Copy/Paste in Place/Stroke paint to black/Make it thick (20px thickness for us), and blur with a suitable parameter value (was 2 for us), then move the layer one below. Voila, outlined text, dapper. Just had to make one, then replace the text (and recreate the outline).

Editing

I needed to put together the video with the slides, side by side, with synchronized timing, so it would look similar to the original talk. First, I needed to get the slides in image format, which I keep doing with ImageMagick, with this script:

Unfortunately, the ImageMagick on my ArchLinux seems to be choking on this script, which used to work so well 6 months ago. Looks like it outputs only the first slide. I had another computer with an older version of ImageMagick and it worked like a charm. Got to investigate it – or wait until the next Ignite and hope the problem goes away.
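The script itself is not reproduced in this post. As a stand-in, here is a minimal sketch of one common way to turn a PDF into numbered slide images with ImageMagick's convert. The function name, the 150 DPI density and the slide-NN.png naming are assumptions for illustration, not the settings used for the Ignite slides.

```bash
#!/bin/bash
# Convert every page of a PDF into a numbered PNG file.
# $1 = input PDF, $2 = output directory
pdf_to_slides() {
    pdf="$1"
    outdir="$2"
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1
    # -density sets the rasterization resolution (DPI) and goes before the input.
    # %02d in the output name numbers the pages: slide-00.png, slide-01.png, ...
    convert -density 150 "$pdf" "$outdir/slide-%02d.png"
}

# Usage: pdf_to_slides talk_extra.pdf slides/
```

If only the first page comes out, checking the number of files in the output directory right after the conversion catches the problem before editing starts.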

Next I had to find suitable software for editing. As much as I looked, Linux didn’t have many good ones, or even many of any kind. Avidemux doesn’t like MP4, and it has bitten me in the butt before, Cinelerra didn’t work for some other reason, and these are the two in the official ArchLinux repo that I could find. Had to look more to get to Flowblade, an editor written in Python. I got to say, it’s such an amazing bugfest that it drove me nuts sometimes, but it got the work done, and that’s more than I can say about the rest of the candidates.

The recipe: the video goes on one channel and the slide images on the other, a translation moves them side by side, then finally a blending compositor is added so that the layers don’t overlap.

Flowblade window during editing
Combining the video and slides in Flowblade

Looks quite good. It was a rocky start to figure out how it went, but once it went well, I got one talk done in about 10 minutes, which is great! It helps that the Ignite format is predictable (15s each, so I could just add an image, and find the insertion point for the next one in a very short time). Though now I see that Acrobat Reader, which I use to display the slides, usually doesn’t do accurate timing.
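Because the format is fixed, the insertion points can be computed instead of found by eye: slide n starts at (n - 1) × 15 seconds. A small sketch that lists them follows. It assumes the talk really did advance every 15 seconds, which, as noted, the viewer does not always guarantee. The function name is made up.

```bash
#!/bin/bash
SLIDETIME=15   # seconds per slide, fixed by the Ignite format

# Print the start time (mm:ss) of slide number $1, counting from 1.
slide_start() {
    secs=$(( ($1 - 1) * SLIDETIME ))
    printf '%02d:%02d\n' $(( secs / 60 )) $(( secs % 60 ))
}

# List the insertion points for a full 20-slide talk.
n=1
while [ "$n" -le 20 ]; do
    echo "slide $n starts at $(slide_start "$n")"
    n=$(( n + 1 ))
done
```

The last slide starts at 04:45 and ends at 05:00, so any drift against the recording shows up as a growing offset towards the end of the talk.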

Then came the rendering. Originally I used a pre-compressed version of the videos (from 2GB for 5 minutes to about 200MB), but the quality just wasn’t that good after the joint rendering. Went back to the original source, and tweaked the parameters. In the end, I discarded the entire stock parameter setting, and used these ones manually. To get a good result, the important ones are preset (from ultrafast to veryslow, increasing quality) and crf (lower number means better quality; 18 is good, 17 is a bit of overkill but I like it, at about 20% larger file than 18):
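The exact Flowblade render settings are not reproduced in this post. To show how the two knobs fit together, here is a sketch of the equivalent encode done directly with ffmpeg and libx264. This is an illustration under assumptions, not the settings used for the Ignite videos: the function names are made up and the CRF check only accepts whole numbers, for simplicity.

```bash
#!/bin/bash
# Check that a CRF value is a whole number in the 0-51 range of 8-bit x264.
valid_crf() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 0 ] && [ "$1" -le 51 ]
}

# Re-encode the video stream with x264, copying the audio unchanged.
# $1 = input, $2 = output, $3 = preset (e.g. slow), $4 = crf (e.g. 18)
encode_x264() {
    valid_crf "$4" || { echo "crf must be a whole number from 0 to 51" >&2; return 1; }
    ffmpeg -i "$1" -c:v libx264 -preset "$3" -crf "$4" -c:a copy "$2"
}

# Usage: encode_x264 talk.mov talk.mp4 slow 18
```

A slower preset spends more time to get a smaller file at the same CRF, so the two can be tuned separately: pick the CRF for quality, then the slowest preset the schedule allows.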

Too bad that Flowblade cannot do 2-pass encoding, maybe can do that another time differently, for better quality and smaller filesize. Also, had to make sure that I marked the beginning and the end of the video well and just rendered that part, otherwise it crashed right at the end of the job. Not good, considering that it took about 40 minutes each to be done.

Flowblade is unfortunately using absolute paths to remember things, and cannot edit the settings very easily, so couldn’t move the rendering onto a more powerful machine. Maybe could make a tool for save-file conversion, it’s all Python anyways…

When that was done, I remembered from listening to a lot of Youtube videos that the sound left something to be desired, and all I had was a puny DSLR microphone recording, so I ran it through a normalization to tweak the level of the audio.

Upload

Time to upload to Youtube. Fortunately the network is great here in Taiwan, so it was a very short time, sometimes the postprocessing took longer than the actual upload.

The Youtube video info editing window
Editing all the options for the uploaded video on Youtube

From the settings I always go for Creative Commons, not that anyone has remixed our videos yet. I also set some tags, maybe that can improve our visibility in search, as well as the recording date and location. Writing the description is tricky. Most of our videos are found through recommendation on the Youtube website, so it should be good. I wonder what would make it work better?

Ready to share

All that done, just created a playlist to tie the videos together, and found this new (to me at least) look, which is not too bad:

Youtube playlist window with all the uploaded videos
Finally, got the playlist ready

Now, after 2 days we have over 200 views, which is not bad considering that I could barely advertise yet, only friends sharing on Facebook and such (and you know how Facebook hides everyone’s precious updates, so it’s a surprise we even have this many).

Finally, the results can all be enjoyed.

 

Categories
Computers

Tech setup for an Ignite

Recently I was co-organizing our Ignite Taipei #2 event (see the pictures and watch the talks – the latter if you speak Chinese…). I try to be the self-proclaimed Chief Technology Officer (CTO) of the event. Either because I hope the tech side of things goes down well, or more likely because if things fail then there’s no one else to blame but myself. And things do indeed fail all the time, in all different ways.

So as a CTO I try to make sure that everything computer related runs well and I did collect some useful scripts (for that mostly command-line driven way I do things). I thought it would be useful to write them all up, mostly for me to remember, not just the scripts themselves but the rationale of some choices made along the way.

As Ignite’s motto is “Inspire us but make it quick”, I found that the Ignite organizer’s motto can be “Everything will be just fine.”

The computers I used for getting things ready for the show

Intro

Ignite in short is an evening of quick presentations, each one of them exactly 5 minutes, 20 slides that auto-advance every 15 seconds. Altogether there are 10-16 talks in an evening. It aims to be an event for inspiring people, and something that should be relatively straightforward to organize.

Computer setup

I had 2 computers for the event, but those were actually 3 systems:

  • A Linux system (Arch Linux), my own computer that I knew very well and ran the presentation off.
  • A Linux/Windows dual boot system that I borrowed, Windows (Vista of all things) for PPT-PDF conversion and Ubuntu for live streaming.
If I could, I’d ditch the Windows part altogether, but if I can’t, it would probably be better to run it in a virtual machine (so I don’t have to reboot, more on this later) or on a separate (third) computer.

Pre-event

Most of the organization was done on Google Docs with a shared document, keeping tabs of who we have as speakers, what needs to be done.

Shameless plug: next time it will be even easier with WatchDoc, a Chrome extension I wrote after Ignite to get notification when shared documents change :)

Besides keeping up with what to do, I had to take care of the presentations that the speakers sent to me. Since I’m not a fan of PowerPoint and its unreliability to display things the same way on different computers (and LibreOffice not being ready to handle .ppt/.pptx as well as I’d like it to), I had to convert everything into PDF. People with Mac, using Keynote were easy, they have PDF export. Most people using Windows/MS Office didn’t have one. First I installed a “print to PDF” plugin but that was just terrible, awful quality photo conversion and all. In the end I had to get an Office 2007 just for this occasion and use their own Save to PDF add-in, for great good. Actually, it worked like a charm, I just wonder why they didn’t make the same thing for Office 2003 that seems to be much more abundant (I know, 8-year-old software gets no love). Finally I had everyone’s thing in the same, reliable format, and uploaded to Google Docs and Dropbox, so I can transfer it between computers easily.

One more thing I had to do to the slides: add an empty slide at the end of the 20 slides of each presentation, so it’s easy to see when they are finished. Used pdfmanipulate for that. Prepare an empty slide (can use e.g. LibreOffice), export it as empty.pdf, then having all the talk PDFs in a sub-directory called “original”, run a script like this:

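#!/bin/bash
# Append empty.pdf to every talk PDF found in the "original" directory
DIR=original

for f in "$DIR"/*.pdf
do
    remf="${f%.pdf}"      # drop the .pdf extension
    newf="${remf##*/}"    # drop the directory part
    echo "${newf}"
    pdfmanipulate merge --output="${newf}_extra.pdf" "$f" empty.pdf
done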
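The two parameter expansions in the loop are easy to misread, so here they are in isolation. This is a small illustration only. The function name extra_name and the sample file name are made up.

```bash
#!/bin/bash
# Derive the output name the same way as the loop above.
extra_name() {
    f="$1"
    remf="${f%.pdf}"      # remove the shortest trailing match of ".pdf"
    newf="${remf##*/}"    # remove the longest leading match of "*/", i.e. the directory
    echo "${newf}_extra.pdf"
}

extra_name "original/talk01.pdf"   # prints talk01_extra.pdf
```

The merged files therefore land in the current directory, not in “original”, which keeps the speakers’ untouched files separate from the ones used on stage.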

Slides

The tech setup in the front; From Ignite Taipei #2

There were two kinds of slides to take care of: one with the speakers’ names and their talk titles, the other the talks themselves, prepared as described earlier.

The speaker intro slides were done in LibreOffice. Had a little problem with that one as well, as if Chinese characters were set to bold text, they showed up blurred. Not good looking, not even legible sometimes. So just before start I was scrambling to change all text from bold to normal. It’s just a pain.

The talk slides were shown using Impressive, a command line presentation software written in Python. It is pretty good, quite flexible and easy to set the slide show times, transitions and total time progress bar.

Use the following script saved as e.g. present.sh to show the talk as present.sh nextalk_extra.pdf:

#!/bin/bash
# Run the presentation

PRESENTATION="$1"
SLIDETIME=15
TOTALTIME=300

./impressive.py -D 1 \
                -a ${SLIDETIME} \
                -d ${TOTALTIME} \
                -T 350 \
                -t CrossFade \
                -c persistent \
                "${PRESENTATION}"

There are some problems with it, though. Sometimes it took quite a while to show a slide that had a good quality photo, which messed up the timing. The total time didn’t quite work out the way I expected, I think every presentation was a bit longer than 5 minutes overall. Not too big a problem, but if I can get it right, then I should.
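One cause of a wrong total can be ruled out before the event: a deck with the wrong number of pages. With 20 slides at 15 seconds each the talk is exactly 300 seconds, and the appended empty slide makes 21 pages. Here is a sketch of a check using pdfinfo from poppler-utils. The function names are made up, and it assumes pdfinfo is installed.

```bash
#!/bin/bash
EXPECTED_PAGES=21   # 20 talk slides plus the empty closing slide

# Extract the page count from pdfinfo output read on standard input.
pages_from_info() {
    awk '/^Pages:/ { print $2 }'
}

# Succeed only if the PDF has the expected number of pages.
check_deck() {
    pages="$(pdfinfo "$1" 2>/dev/null | pages_from_info)"
    if [ "$pages" != "$EXPECTED_PAGES" ]; then
        echo "$1: expected $EXPECTED_PAGES pages, found ${pages:-none}" >&2
        return 1
    fi
}

# Usage: for f in *_extra.pdf; do check_deck "$f"; done
```

This does not address slow rendering of large photos, only the page count.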

Previously I used Adobe Acrobat Reader’s auto-advance in full screen mode, which was just fine. I might go back to it next time if I cannot fix Impressive.

Web streaming + recording

It is common to live-stream Ignite so other people can watch it too. I found it easiest to do with VLC + Justin.TV, they already have some software done that seems to work quite well. The first time I had some sound sync issues, this time I managed to fix that, but in the end it didn’t matter.

To do the stream using my Logitech QuickCam S7500 and sound on the line-in from the venue’s sound system, start VLC with the following script:

#!/bin/bash
# http://community.justin.tv/forums/showthread.php?t=7081
# Using webcam + line-in audio
# Display + Transcode (Save + Stream)
# needs jtvlc to get it out

vlc v4l2:///dev/video0 \
     :input-slave=alsa://plughw:0,0 \
     --sout='#duplicate{dst=display,dst="transcode{venc=x264{keyint=60,idrint=2},vcodec=h264,vb=600,acodec=mp4a,ab=128,channels=1,deinterlace,audio-sync}:duplicate{dst=standard{access=file,mux=mp4,dst=/home/user/ignite.mpg},dst=rtp{dst=127.0.0.1,port=1234,caching=2000,rtcp-mux,sdp=file:///tmp/vlc.sdp}}"}'

It is one mighty long line and I’m not sure if it can really be wrapped anywhere. Basically it displays the webcam’s picture, transcodes it to x264 and sends it to an rtp socket and saves it into a file at the same time. Might want to be careful with this, as the saved file will be overwritten every time the script is run.
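One way around the overwriting is to put a timestamp into the file name. A minimal sketch follows. The function name and the ignite- prefix are made up for illustration.

```bash
#!/bin/bash
# Build a unique output path so an earlier recording is never overwritten.
# $1 = directory to save into
recording_path() {
    echo "$1/ignite-$(date +%Y%m%d-%H%M%S).mpg"
}

OUTFILE="$(recording_path "$HOME")"
echo "saving to ${OUTFILE}"
```

To use it, the fixed dst=/home/user/ignite.mpg in the VLC command would be replaced with dst=${OUTFILE}. Note that the --sout argument is single-quoted in the script above, so the quoting has to be changed for the shell to expand the variable.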

The most curious thing is that in the copy that is displayed, the sound is not in sync, but actually the transcoded video is okay.

Next step is sending the stream out to Justin.tv, using jtvlc:

#!/bin/bash
./jtvlc-lin-0.41/jtvlc "${JUSTINTV_USERNAME}" "${JUSTINTV_STREAM_KEY}" "/tmp/vlc.sdp" -d

Here the username and stream key have to be filled in with your values. If everything’s fine then there’s a big stream of debug information on the screen and on the website the channel goes online.

At our event I had a problem that after a reboot somehow the sound input was borked and all our video was without sound; the final saved file couldn’t even be opened. Never mind, fortunately we had a recording from a proper camera.

HD recording + postprocessing

It is also essential to have some good recording of the show, since that can make it really reach a wider audience, and it’s good to look back at it later as well.

This time we had a cameraman helping out, and after the talks finished I got the videos from him. He’s using a Sony camera with a FAT32-formatted SD card, which was a nice pain to manage. Last time the video we had was in straightforward mp4 format, while this time it was 2GB chunks of AVCHD, a proprietary format. Had to convert that into something I can manage.

It took a while to figure out, but in the end the result was acceptable.

1) Join the split parts with tsMuxeR. Had to make sure I’m using joins, not just listing all the files and saving them as one (sounds like the same thing but it isn’t). In the end I had some “.m2ts” files.

2) Next I had to convert those into mp4. The m2ts files actually had h264 video that is good for mp4, so I ended up just copying it. The audio had to be transcoded to aac, because the original was ac3, which is not an accepted audio codec for mp4. Fortunately ffmpeg can handle AVCHD files now. The only really tricky part was that the original video was 1080i – interlaced. Next time I have to make sure that the cameraman sets things to progressive, it makes life so much easier.

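#!/bin/bash
# Usage: pass the input .m2ts file and the output .mp4 file as arguments
INFILE="$1"
OUTFILE="$2"
# Start recreating video: copy the h264 video stream as-is, transcode the audio to aac.
# Note: with -vcodec copy the video is not re-encoded, so -deinterlace
# most likely has no effect on the output here.
ffmpeg -deinterlace \
       -i "${INFILE}" \
       -f mp4 \
       -vcodec copy \
       -strict experimental \
       -acodec aac \
       -ab 128k \
       -y "${OUTFILE}"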

This took quite a short time, which is a relief. Before I figured out that things can be this simple, I had transcoded files into hundreds of gigabytes, and things took hours.
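Whether a clip is interlaced can be checked up front, for example on a short test recording from the cameraman. Here is a sketch using ffprobe, which ships with ffmpeg. The function names are made up, and the result is only as reliable as the metadata stored in the file.

```bash
#!/bin/bash
# Print the field order of the first video stream, as reported by ffprobe.
field_order() {
    ffprobe -v error -select_streams v:0 \
            -show_entries stream=field_order \
            -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Classify a field_order value into progressive / interlaced / unknown.
scan_type() {
    case "$1" in
        progressive) echo progressive ;;
        tt|bb|tb|bt) echo interlaced ;;
        *)           echo unknown ;;
    esac
}

# Usage: scan_type "$(field_order clip.m2ts)"
```

An "unknown" result means the file does not say, in which case looking at a fast-moving scene frame by frame is still the fallback.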

3) Split the video for uploading to YouTube. Used Avidemux for that, which can – barely – handle h264 encoded video. Somehow it couldn’t find the key-frames so it decided that every 30th frame must be one and I can only split things there. This resulted in every video having some strange pictures in the beginning (duh, missing keyframe). Might be better with re-encoding, or rather finding a better program to handle h264 video.

Future

Fixing stuff at halftime; From Ignite Taipei #2

Of course there are some things to improve next time:

  • Better video recording that eases post-processing. Preferably have our own camera that we learn how to use and not have to figure out something new every time
  • Once a computer is set up and tested, no more reboots
  • Check the timing of the slides to make sure they are really 15 seconds each
  • Improve Impressive, maybe prepare some patches: exit when presentation finished, show empty screen after last slide, timing issues
  • Switch to scripted intro slides so I don’t have to edit each of them to make sure they look the same. Maybe use one of the Javascript web presentation frameworks and a full-screen Chrome window…
  • Set up our own website for Ignite. This one is the big one. Should allow us to do a lot of interesting things, but I try to get myself to make it first with a few features only, not with all bells and whistles. A WordPress page with some plugins or a full blown Django site or something else? Time will tell…
Categories
Taiwan

Igniting Taipei

A few weeks ago with a couple of friends we started to organize the first Ignite Taipei. There are still 5 weeks and a bit to go, but it has already been a fun experience. In many ways, starting a community feels very similar to how doing a startup would feel (I imagine). No surprise there, the startups I would want to create would want to have a great community. :)

So far it is mostly about choosing the place, the time, starting to invite people, keeping in touch with them, building involvement by others and keeping those “fans”. It’s shaping up nicely, but there’s a lot more to go, we are not ready yet.

Another connection I found with doing a business: the best way to build up one’s own enthusiasm is to be as closely involved as possible. I keep watching Ignite videos on YouTube and sharing them. Writing a blog about what’s going on. Talking to people about it and seeing what they are interested in. Since at the actual event I think I will be managing the technical issues, there’s one thing I haven’t thought about before: what kind of talk would I give? How would I use my 5 minutes / 20 slides to have an impact? Unless I know that, I cannot really recruit speakers well, cannot help them effectively and would miss out on the core of things. Also, it does help to exercise my idea muscles [1].

Ignite talk brainstorming
What _else_ to talk about?

Here’s the copy of the brainstorming I had today while I was waiting for my lunch:

  • 30 day challenge: take different bus route I haven’t taken before
  • 100 uses of measuring time
  • Hungarian for dummies
  • Comparative tea-ology
  • Feynman’s spaghetti-breaking experiment
  • “How to measure the height of the lighthouse with a barometer”
  • Camino de Santiago
  • Organizing Ignite
  • Geocaching
  • Version control systems for fun and profit
  • Rejection Therapy
  • Startupbus
  • Everyday physics
  • A very short introduction to <insert author’s name here> (e.g. Palahniuk, Vonnegut, Beckett)
  • Kitchen in a pot: the electric rice cooker
  • Long distance travelers of ancient times
  • 100 uses of a wiki
  • Open-source hardware
  • Movie stars’ movies before they became really famous
  • All those different ways of brewing coffee

These I think fall into two categories: things I know a little about, and things I know too little about but would use Ignite as an excuse to learn more. Actually, since I wrote up this list more ideas keep flowing in and I think I will have to prepare some of these, even without a plan to show them to anyone: why would one need an excuse to do something awesome?

Any more ideas to talk about?

[1] “Idea muscles” come from James Altucher, one of my favorite bloggers/writers lately. It is the habit of being creative, or in his words:

Every day I write down ideas. I write down so many ideas that it hurts my head to come up with one more. Then I try to write down five more.

I’m not that good at this just yet. The list above is as long as it is because that’s where my page got full. Not as if there weren’t 97 other, empty pages in my notebook… Maybe I’m too pain averse, but I’ve got to overcome that. I actually long for the feeling of coming up with so many ideas that it hurts to think of more…