Closure of DFI effective 10th July 2024

Status
Not open for further replies.

SKC

New Member
Joined
Aug 16, 2014
Messages
9,483
Likes
32,249
Guys calm down a bit.

Take rest for a day or two.

Let us think together about a possible solution.

I will download the latest version of a PHP forum CMS and tinker with it a bit.

Let's see how to take care of the hosting issues.
 

Extraordinary

New Member
Joined
Apr 5, 2024
Messages
184
Likes
335
What about those who were refugees from BRF in the first place lol?
Guess it's time to transcend to the higher plane of existence, Twitter I mean
A forum has a different environment altogether.
Even signing up to this forum requires a certain IQ.
Facebook and Twitter can be used by anyone; this platform offers a different type of engagement.
 

Jor_Se_Bolo_PKMKB

New Member
Joined
May 30, 2019
Messages
402
Likes
2,790
From ChatGPT... I will attempt to scrape the data tomorrow.

# Scraping Data from DefenceForumIndia

## Overview

The process of scraping a website, especially a large one like a forum, involves several steps and considerations. Below is a high-level overview of what you need to do to scrape data from DefenceForumIndia.

### Steps

1. **Check the Site's Terms of Service**: Ensure that scraping the site does not violate any terms of service or legal agreements.

2. **Identify the Structure**: Analyze the forum's structure to understand how the data is organized (e.g., threads, posts, user profiles).

3. **Choose a Scraping Tool**: Use tools like `BeautifulSoup` and `requests` in Python, or specialized scraping frameworks like Scrapy.

4. **Write the Scraping Script**:
   - Start by writing a script to crawl and download the HTML content of each page.
   - Parse the HTML to extract the relevant data (e.g., post content, authors, timestamps).
   - Store the extracted data in a structured format (e.g., CSV, JSON, database).

5. **Handle Pagination**: Forums usually have multiple pages for threads and posts. Make sure your script can navigate through these pages.

6. **Be Mindful of Rate Limits**: To avoid getting banned or overloading the server, implement delays between requests and respect any rate limiting specified by the website.

7. **Data Storage**: Decide where to store the scraped data. Options include local files, databases, or cloud storage.

8. **Backup and Redundancy**: Regularly back up the data to prevent loss in case of interruptions during scraping.
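
For step 6, one place a site may publish crawl rules is its `robots.txt` file. The sketch below reads such rules with Python's standard `urllib.robotparser`. The sample rules and the `crawl_rules` helper are made up for illustration; they are not taken from DefenceForumIndia.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 2
"""


def crawl_rules(robots_text, url, user_agent="*"):
    """Return (allowed, delay_seconds) for url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_rules(SAMPLE_ROBOTS, "https://example.com/forums/"))
    print(crawl_rules(SAMPLE_ROBOTS, "https://example.com/admin/tools"))
```

For a live site, the same parser can fetch the file itself via `set_url()` followed by `read()`; a delay of `None` means the file specifies no crawl delay.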

### Example Scraping Script

Below is a simplified example using Python with `requests` and `BeautifulSoup`:

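```python
import csv
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# A browser-like User-Agent; some servers reject the default one sent by requests.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}


def text_of(tag):
    """Return the stripped text of a tag, or '' if the tag was not found."""
    return tag.get_text(' ', strip=True) if tag else ''


def scrape_forum(forum_url, output_file, delay=1):
    session = requests.Session()
    session.headers.update(headers)
    seen = set()  # thread URLs already scraped

    with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
        fieldnames = ['thread_title', 'post_content', 'author', 'timestamp']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

        page = 1
        while True:
            response = session.get(f"{forum_url}page-{page}", timeout=30)
            if response.status_code != 200:
                break

            soup = BeautifulSoup(response.content, 'html.parser')
            # The class names below assume XenForo 2's default markup; check
            # them against the live page before relying on them.
            threads = soup.find_all('div', class_='structItem--thread')
            new_threads = 0

            for thread in threads:
                title_div = thread.find('div', class_='structItem-title')
                links = title_div.find_all('a', href=True) if title_div else []
                if not links:
                    continue
                # The last link is taken as the thread link (a prefix link may come first).
                # urljoin resolves relative hrefs against the page that was fetched.
                thread_url = urljoin(response.url, links[-1]['href'])
                if thread_url in seen:
                    continue
                seen.add(thread_url)
                new_threads += 1
                thread_title = text_of(links[-1])

                # Scrape the first page of the thread only; longer threads
                # need the same page-N loop as the forum listing.
                time.sleep(delay)
                thread_response = session.get(thread_url, timeout=30)
                if thread_response.status_code != 200:
                    continue
                thread_soup = BeautifulSoup(thread_response.content, 'html.parser')

                # Iterate over whole posts so that the author, which sits
                # outside the message body, is found alongside the content.
                for post in thread_soup.find_all('article', class_='message'):
                    time_tag = post.find('time')
                    writer.writerow({
                        'thread_title': thread_title,
                        'post_content': text_of(post.find('div', class_='bbWrapper')),
                        'author': text_of(post.find(class_='message-name')),
                        'timestamp': time_tag.get('datetime', '') if time_tag else '',
                    })

            # Stop when a page yields nothing new (an empty page, or the server
            # serving the last page again for an out-of-range page number).
            if new_threads == 0:
                break

            page += 1
            time.sleep(delay)  # Be respectful with your requests


if __name__ == '__main__':
    scrape_forum('http://defenceforumindia.com/forums/some-forum/', 'forum_data.csv')
```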
 

Extraordinary

New Member
Joined
Apr 5, 2024
Messages
184
Likes
335
At best you will scrape some text. Photos won't be scraped that easily.
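
Collecting the photos would mean a second pass over each post's HTML to find the image URLs, then downloading them one by one. Below is a minimal sketch of the first half, using only the standard library. The `data-src` fallback, the `ImageCollector` class and the `image_urls` helper are assumptions for illustration, not something confirmed for this forum.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute image URLs from <img> tags in a chunk of HTML."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # Assumption: lazy-loaded images keep the real URL in data-src.
        src = attr.get("data-src") or attr.get("src")
        if src:
            # Resolve relative paths against the page the HTML came from.
            self.urls.append(urljoin(self.page_url, src))


def image_urls(html, page_url):
    """Return the image URLs found in html, in document order."""
    collector = ImageCollector(page_url)
    collector.feed(html)
    return collector.urls
```

Each URL would still need its own request and its own delay, which is where the bulk of the time and storage would go.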
 

hurrians

New Member
Joined
Nov 24, 2023
Messages
1,113
Likes
1,926
HTTrack Website Copier might do it; I don't know whether it works. Of course, only with admin approval.

Someone here may know the owner directly or indirectly. I hope they are in touch (in person) and can give us clarity going forward.
 

hurrians

New Member
Joined
Nov 24, 2023
Messages
1,113
Likes
1,926
- Keep the same site, with ownership transferred.
- Build another site; all XenForo forums have a similar UI.
- It will depend on what the owners tell us or allow.
 

Extraordinary

New Member
Joined
Apr 5, 2024
Messages
184
Likes
335
Y'all nibbiars, the reason is simple: no money issues or legal troubles. The owner just got over the site and decided to close it down.
He can transfer the ownership and we will take care of the cost.
Resolving legal trouble might be difficult.
 

hurrians

New Member
Joined
Nov 24, 2023
Messages
1,113
Likes
1,926
- Most of them started in their teen years.
- Now they are grown-ups with careers.
- So it was visible that the admins could focus less on this place.
 

mokoman

New Member
Joined
May 31, 2020
Messages
6,484
Likes
34,873
There are 3 million posts on this website. Let's chalk it up to 20 KB per post; then it's not more than 60 GB worth of text data.
What takes up the space is the image files; those probably take up several terabytes.
5 to 7 terabytes of images, easily.
Nice, looks like it: 2,742,777 posts.

It's there in the post link.
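
The estimate above is simple multiplication and can be rechecked against the exact post count. A quick sketch, assuming the same 20 KB per post and decimal units; `text_size_gb` is just an illustrative helper:

```python
def text_size_gb(post_count, kb_per_post=20):
    """Estimated text volume in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return post_count * kb_per_post / 1_000_000


if __name__ == "__main__":
    print(text_size_gb(3_000_000))            # the rounded 3 million figure
    print(round(text_size_gb(2_742_777), 1))  # the count from the post link
```

With 2,742,777 posts this comes to roughly 54.9 GB, so 60 GB is a fair upper bound for the text, provided the 20 KB guess holds.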
 

angryIndian

New Member
Joined
Feb 26, 2013
Messages
1,047
Likes
4,269
Oh no!
I grew up with MP.net, Bharat-Rakshak and DFI. It is disheartening to learn that one of the most respected forums in this space is shutting down.
 

shashankk

New Member
Joined
Jan 28, 2018
Messages
835
Likes
4,135
Brother, you are still alive in the forum world. :) Glad to see you.
 